I have an unusual situation that has cropped up today, both on newly created Power BI files and in existing files that were previously working without issue.
When importing string data, the dictionary size of the column is over a megabyte regardless of the data that is imported. This obviously causes a significant model size bloat for any small tables that have a lot of columns.
This issue is occurring in data imported from SQL Server, Synapse, Data Lake Gen2 and local filestore.
The effect can be seen in the Col Size values below for all the string columns, which bear no resemblance to the differences in Cardinality. As a result, importing a single 1,206 KB CSV file results in a 38.15 MB model.
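To put those two figures in perspective, here is a quick sketch of the arithmetic. It assumes binary units (1 MB = 1024 KB), and the variable names are purely illustrative:

```python
# Figures quoted above; binary units (1 MB = 1024 KB) are an assumption.
csv_size_kb = 1206
model_size_mb = 38.15

# Convert the model size to KB so both figures share a unit.
model_size_kb = model_size_mb * 1024

# How many times larger the model is than the source file.
bloat_ratio = model_size_kb / csv_size_kb

print(f"Model is about {bloat_ratio:.1f}x the size of the source file")
```

Under that assumption the model comes out at roughly 32 times the size of the source file, where I would normally expect the imported model to be no larger than the file itself.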
Has anyone else had this issue, or does anyone know how to rectify it? The only thing I can think of that has changed (besides a minor background update) is upgrading to the new model view, though I did that about a week ago and this has only presented itself today.
VertiPaq Analyzer metrics of a new model with one small csv loaded:
Power BI details:
Release: December 2020
Product Version: 2.88.1144.0 (20.12) (x64)
OS Version: Microsoft Windows NT 10.0.18363.0 (x64 en-GB)
CLR Version: 4.7 or later [Release Number = 528040]
Model Default Mode: Import
Model Version: PowerBI_V3
Is Report V3 Models Enabled: True
Enabled Preview Features:
PBI_NewWebTableInference
PBI_v3ModelsPreview
Disabled Preview Features:
PBI_shapeMapVisualEnabled
PBI_SpanishLinguisticsEnabled
PBI_JsonTableInference
PBI_ImportTextByExample
PBI_ExcelTableInference
PBI_qnaLiveConnect
PBI_eimInformationProtectionForDesktop
PBI_azureMapVisual
PBI_dataPointLassoSelect
PBI_compositeModelsOverAS
PBI_narrativeTextBox
PBI_dynamicParameters
PBI_anomalyDetection
PBI_newFieldList
PBI_cartesianMultiplesAuthoring
Disabled DirectQuery Options:
TreatHanaAsRelationalSource
question from:
https://stackoverflow.com/questions/65901618/power-bi-dictionary-size-for-string-columns-all-over-1mb-despite-small-data