I have found solutions for detecting cell formats and data types on the one hand, and solutions for avoiding formulas that I cannot evaluate in Python on the other. However, I have not found a solution that both yields the calculated values of all formulas and preserves the format information (in my case, whether a number is a percentage value). Is this possible without reading the Excel file twice?
Currently, I use the following unsatisfactory code (it does what I want, but I consider it a mess):
from dataclasses import dataclass
from typing import List

import openpyxl


@dataclass
class ExcelSheet:
    title: str
    lines: List [List [str]]


class ExcelDocument:
    def __init__ (self):
        self.absFilePath = ""
        self.sheets: List [ExcelSheet] = []

    def read (self, absFilePath: str):
        # first load: cached formula results only
        workBookValues = openpyxl.load_workbook (absFilePath, read_only=True, data_only=True)
        # second load: formulas and format information
        workBookAll = openpyxl.load_workbook (absFilePath)
        # copy all content into an ExcelSheet list;
        # both classes exist from xlrd times ("historic reasons") -
        # keep existing applications unchanged - and shall not be removed
        for sheet in workBookAll:
            self.sheets.append (thisSheet := ExcelSheet (sheet.title, []))
            iLine = 1
            for line in sheet.iter_rows ():
                thisSheet.lines.append ([])
                myLine: List [str] = thisSheet.lines [-1]
                iCell = 0
                for cell in line:
                    if cell.data_type == 'n' and '%' in cell.number_format:
                        # percentage: scale the value and tag it with '%'
                        myLine.append ((cell.value * 100, '%'))
                    # convert a pure date from datetime to date in order to avoid datetimes with time 0:00:00
                    # (elif, so that a percentage cell is not appended a second time by the else branch)
                    elif (cell.data_type == 'd' and
                          "h:m" not in cell.number_format and
                          "m:s" not in cell.number_format and
                          str (cell.value).endswith (" 00:00:00")):
                        myLine.append (cell.value.date ())
                    elif cell.data_type == 'f':
                        # formula: take the cached result from the data_only workbook
                        myLine.append (workBookValues [sheet.title] [iLine] [iCell].value)
                    else:
                        myLine.append (cell.value)
                    iCell+=1
                iLine+=1
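To make the intended per-cell rule explicit, here is a minimal sketch that works on plain values instead of openpyxl cells, so it can be tried without a workbook. The helper name `convert_cell` and its three parameters are hypothetical; they only stand in for `cell.value`, `cell.data_type` and `cell.number_format`. The guard against `None` is an assumption on my side (an empty cell that carries a percent format would otherwise fail on the multiplication).

```python
from datetime import date, datetime
from typing import Any


def convert_cell(value: Any, data_type: str, number_format: str) -> Any:
    """Hypothetical helper: apply the percent/date rule to plain values."""
    # Numeric value with a percent format: scale it and tag it with '%'.
    # Assumption: None (empty cell) is passed through unchanged.
    if data_type == "n" and "%" in number_format and value is not None:
        return (value * 100, "%")
    # Date value whose format shows no time and whose time part is midnight:
    # reduce the datetime to a plain date.
    if (
        data_type == "d"
        and isinstance(value, datetime)
        and "h:m" not in number_format
        and "m:s" not in number_format
        and value.time() == datetime.min.time()
    ):
        return value.date()
    # Everything else is kept as it is.
    return value


percent = convert_cell(0.25, "n", "0.00%")
pure_date = convert_cell(datetime(2021, 1, 6), "d", "yyyy-mm-dd")
timestamp = convert_cell(datetime(2021, 1, 6, 12, 30), "d", "yyyy-mm-dd h:mm")
print(percent, pure_date, timestamp)
```

The formula branch is deliberately left out of this sketch, because it is the part that currently forces the second load of the file.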
By the way, I would be grateful if anyone could explain why the following code, which comes directly after the cell loop inside the sheet loop, fails (cannot be executed) as soon as I also open workBookAll with read_only=True:
# copy the value of the top-left cell into every cell of each merged range
for mergedCells in sheet.merged_cells.ranges:
    refCellValue: any = sheet [mergedCells.bounds [1]] [mergedCells.bounds [0] - 1].value
    for iLine in range (mergedCells.bounds [1], mergedCells.bounds [3] + 1):
        for iCell in range (mergedCells.bounds [0] - 1, mergedCells.bounds [2]):
            thisSheet.lines [iLine - 1] [iCell] = refCellValue
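The index arithmetic of that snippet can be checked on a plain list of lists, independent of openpyxl. This is only a sketch: `fill_merged` is a hypothetical helper, and it assumes that `bounds` is the 1-based, inclusive tuple `(min_col, min_row, max_col, max_row)`, which is how the snippet above uses it.

```python
from typing import Any, List, Tuple


def fill_merged(lines: List[List[Any]], bounds: Tuple[int, int, int, int]) -> None:
    """Hypothetical helper: copy the top-left value into a whole merged range."""
    # Assumption: bounds is 1-based and inclusive, lines is 0-based.
    min_col, min_row, max_col, max_row = bounds
    ref_value = lines[min_row - 1][min_col - 1]
    for i_line in range(min_row - 1, max_row):
        for i_cell in range(min_col - 1, max_col):
            lines[i_line][i_cell] = ref_value


# Two rows, three columns; the range A1:B2 is treated as merged.
rows = [["a", None, "x"], [None, None, "y"]]
fill_merged(rows, (1, 1, 2, 2))
print(rows)
```

It shows the result I expect from the merged-cell loop; it does not touch the read_only question itself.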
question from:
https://stackoverflow.com/questions/65601557/openpyxl-read-excel-calculated-cell-values-data-only-true-but-detect-format