Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - openpyxl: read Excel calculated cell values (data_only=True) but detect format (e.g. percentage)

I found solutions for detecting several formats and data types on the one hand, and solutions for avoiding formulas that I cannot calculate in Python on the other. However, I didn't find a solution that both returns the calculated result of every formula and preserves format information, in my case whether the number is a percentage value. Is this possible without reading the Excel file twice?

Currently, I use the following unsatisfying code (it does what I want, but I consider it a mess):

from dataclasses import dataclass
from typing import Any, List

import openpyxl


@dataclass
class ExcelSheet:
    title: str
    lines: List[List[Any]]  # cells hold mixed types (numbers, dates, tuples), not only str


class ExcelDocument:
    def __init__(self):
        self.absFilePath = ""
        self.sheets: List[ExcelSheet] = []

    def read(self, absFilePath: str):
        # first load: cached formula results; second load: formulas and formats
        workBookValues = openpyxl.load_workbook(absFilePath, read_only=True, data_only=True)
        workBookAll = openpyxl.load_workbook(absFilePath)

        # copy all content into an ExcelSheet list;
        # both classes exist from xlrd times ("historic reasons"),
        # keep existing applications unchanged, and shall not be removed
        for sheet in workBookAll:
            self.sheets.append(thisSheet := ExcelSheet(sheet.title, []))

            # iLine is 1-based (worksheet row), iCell is 0-based (index into the row tuple)
            for iLine, line in enumerate(sheet.iter_rows(), start=1):
                thisSheet.lines.append([])
                myLine: List[Any] = thisSheet.lines[-1]

                for iCell, cell in enumerate(line):
                    # percentage-formatted number: store the scaled value with a '%' marker
                    # (empty cells also report data_type 'n', hence the None check)
                    if cell.data_type == 'n' and cell.value is not None and '%' in cell.number_format:
                        myLine.append((cell.value * 100, '%'))

                    # convert a pure date from datetime to date in order to avoid datetimes with time 0:00:00
                    # (elif, not if: otherwise a percentage cell is appended a second time by the else branch)
                    elif (cell.data_type == 'd' and
                          "h:m" not in cell.number_format and
                          "m:s" not in cell.number_format and
                          str(cell.value).endswith(" 00:00:00")):
                        myLine.append(cell.value.date())

                    # formula: take the cached result from the data_only workbook
                    elif cell.data_type == 'f':
                        myLine.append(workBookValues[sheet.title][iLine][iCell].value)

                    else:
                        myLine.append(cell.value)
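
One possible way to get by with a single load, as a rough sketch rather than a confirmed answer: as far as I know, data_only=True only replaces formulas with their cached results, while cell.number_format is still populated, so the value and the format can be read from the same cell. The names normalize_cell, read_sheets and TIME_TOKENS are made up for this illustration and are not part of openpyxl or of the code above. Cached results are only present if the file was last saved by an application that calculated them (such as Excel); otherwise formula cells come back as None.

```python
import datetime
from typing import Any, List, Tuple

# Hypothetical names for illustration; none of them come from openpyxl.
TIME_TOKENS = ("h:m", "m:s")


def normalize_cell(value: Any, number_format: str) -> Any:
    """Map one (value, number_format) pair to the representation used above."""
    # bool is a subclass of int, so exclude it explicitly
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if is_number and "%" in number_format:
        return (value * 100, "%")
    # a datetime at midnight whose format shows no time part is treated as a pure date
    if (isinstance(value, datetime.datetime)
            and value.time() == datetime.time(0)
            and not any(token in number_format for token in TIME_TOKENS)):
        return value.date()
    return value


def read_sheets(path: str) -> List[Tuple[str, List[List[Any]]]]:
    """Load the workbook once and return (title, rows) pairs."""
    import openpyxl  # third-party; imported here so normalize_cell works without it

    # Assumption: data_only=True swaps formulas for cached results but keeps styles.
    workbook = openpyxl.load_workbook(path, data_only=True)
    return [
        (sheet.title,
         [[normalize_cell(cell.value, cell.number_format) for cell in row]
          for row in sheet.iter_rows()])
        for sheet in workbook
    ]
```

Because the decision depends only on the value and the format string, a formula cell with a percentage format is handled the same way as a plain percentage cell, which the data_type == 'n' check in the two-workbook version cannot do for cells whose data_type is 'f'.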

By the way, I would be grateful if anyone could explain why the following code, which comes immediately after the loop above, fails (it cannot be executed) as soon as I open workBookAll with read_only=True as well:

            for mergedCells in sheet.merged_cells.ranges:
                # bounds is (min_col, min_row, max_col, max_row), all 1-based;
                # the top-left cell of the range holds the value
                refCellValue: Any = sheet[mergedCells.bounds[1]][mergedCells.bounds[0] - 1].value

                # copy that value into every cell covered by the merged range
                for iLine in range(mergedCells.bounds[1], mergedCells.bounds[3] + 1):
                    for iCell in range(mergedCells.bounds[0] - 1, mergedCells.bounds[2]):
                        thisSheet.lines[iLine - 1][iCell] = refCellValue
Question from: https://stackoverflow.com/questions/65601557/openpyxl-read-excel-calculated-cell-values-data-only-true-but-detect-format
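
On the merged-cells snippet: a likely explanation, offered as an assumption to verify and not as a confirmed fact, is that openpyxl's read-only mode streams rows and its worksheet objects do not provide merged_cells at all, so the attribute lookup fails. The fill logic itself does not need a worksheet once the bounds are known. A minimal sketch, with fill_merged and Bounds as hypothetical names:

```python
from typing import Any, Iterable, List, Tuple

# (min_col, min_row, max_col, max_row), all 1-based, as in CellRange.bounds
Bounds = Tuple[int, int, int, int]


def fill_merged(lines: List[List[Any]], merged_bounds: Iterable[Bounds]) -> None:
    """Copy each merged range's top-left value into every cell of the range, in place."""
    for min_col, min_row, max_col, max_row in merged_bounds:
        ref_value = lines[min_row - 1][min_col - 1]
        for row in range(min_row - 1, max_row):
            for col in range(min_col - 1, max_col):
                lines[row][col] = ref_value
```

With a workbook loaded in the normal (not read-only) mode, this could be called as fill_merged(thisSheet.lines, [r.bounds for r in sheet.merged_cells.ranges]) after the rows have been copied.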
