Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
2.4k views
in Technique[技术] by (71.8m points)

python - Excel Column Number to Text

When using openpyxl, there are times that I can only refer to columns in Excel worksheets via their column name, not their numerical index. It took me quite a bit of tinkering with failed solutions until I came up with a general approach that works.

I'd like to archive this information for anyone else who needs it.

Solution

Input: Column number (with 1 = A, 2 = B, ...)
Output: Column name in excel

__alphabet = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
def convertColumnNumToLetter(column):
    converted = ""
    
    #figure out the width of the converted text
    columnCount = 1
    base = len(__alphabet)
    base_exponent = base
    exponent = 1
    while column > base_exponent:
        column = column - base_exponent
        exponent = exponent + 1
        columnCount = columnCount + 1
        base_exponent = base_exponent * base
    
    #calculate the actual column name
    column = column - 1
    while len(converted) < columnCount:
        digit = column % base
        column = (column - digit) // base
        converted = __alphabet[digit] + converted
    
    return converted

The above calculates the column name by first figuring out the width of the column name, then using a standard base-n calculation with A=0, B=1, ..., Z=25 (forcing padding with 0s up to the length of the column name).

Figuring out the width of the column name:

1 - Width columns (Textual name, index)

A - 0
B - 1
C - 2
...
Z - 25 Note: these can be calculated with a standard base-26 calculation

2 - Width columns (Textual name, index)

AA - 26 = 26 + 0 Note: The 26 here represents the 26 A -> Z columns we just passed
AB - 27 = 26 + 1
AC - 28 = 26 + 2
...
AZ - 51 = 26 + 25
BA - 52 = 26 + 26 + 0 Note: The second 26 represents the AA -> AZ columns we just passed
BB - 53 = 26 + 26 + 1
...
BZ - 77 = 26 + 26 + 25
...
ZZ - ? = 26 + 26 * 25 + 25 Note: The 26 times 25 are all the AA -> BZ columns we just passed

Note: Ignoring the first 26 in each sum for the 2 - Width columns, we can see that the 2 - Width columns are just base-26 values (padded to width-2).

Conclusion

As it turns out, this pattern continues. The above solution, therefore, works based off the observation that the columns of width 1 are < 26, those of width 2 are < 26 + 26^2, those of width 3 are < 26 + 26^2 + 26^3.

If we calculate the column-width, we've noted that we can just calculate the base-26 column name (with A=0, B=1, ..., Z=25), pad the values to the appropriate width, and we receive the correct column name.

For the interested party, here is the inverse function (calculating the column number from the excel column name):

Solution

Input: Column name in excel
Output: Column number (with 1 = A, 2 = B, ...)

def convertLetterToColumnNum(columnName):
    converted = 0
    columnCount = len(columnName)
    
    base = len(__alphabet)
    base_exponent = 1
    while len(columnName) != 0:
        #strip the right-most digit, convert to index
        digit = columnName[-1:]
        columnName = columnName[:-1]
        digit = __alphabet.index(digit)
        
        #add the value it represents to the total, increment base_exponent
        converted = converted + digit * base_exponent
        base_exponent = base_exponent * base
        
        #add the offset for having passed all the n-width columns
        if len(columnName) != 0:
            converted = converted + base_exponent
    
    return converted + 1

Seeing as these are rather quick algorithms, I'd be interested to see if anyone knew of a solution that was demonstrably faster (when calculating a ton of these in the 3-width range, you might need such a thing).

P.S.
It should be noted that these algorithms work for any alphabet that uses the excel algorithm of naming, notably that:

  • The alphabet is strictly ordered (A < B < C < D ... < Z for example)
  • Columns are named in a gear fashion where the first length-of-alphabet values use 1 gear, the next length-of-alphabet^2 use 2 gears and so on
  • Columns are named in order where the n gears all start on the least element and continue upwards
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)
Waitting for answers

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

2.1m questions

2.1m answers

60 comments

57.0k users

...