Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - How to add rows for a timeseries dataframe?

I am writing a program that loads a time-series Excel file into a dataframe and then creates several new columns using some basic calculations. The program will sometimes read Excel files that are missing months for some records. In the example below I have monthly sales data for two different stores. The stores opened during different months, so their first month-end dates differ, but both should have month-end data up to 9/30/2020. In my file, Store BBB has no records for 8/31/2020 and 9/30/2020 because there were no sales during those months.

Store  Month Opened  State  City      Month End Date  Sales
AAA    5/31/2020     NY     New York  5/31/2020       1000
AAA    5/31/2020     NY     New York  6/30/2020       5000
AAA    5/31/2020     NY     New York  7/30/2020       3000
AAA    5/31/2020     NY     New York  8/31/2020       4000
AAA    5/31/2020     NY     New York  9/30/2020       2000
BBB    6/30/2020     CT     Hartford  6/30/2020       100
BBB    6/30/2020     CT     Hartford  7/30/2020       200


1 Answer

  1. Try upsampling the DateTime index of each store's rows. Ref: pandas-resample-upsample-last-date-edge-of-data
# Group by `Store` first; `group` below is the rows of one store.
# The `Month End Date` column should be converted to DateTime beforehand.
# (pandas 2.2 and later prefer the alias 'ME' over 'M' for month-end frequency.)

group.set_index('Month End Date').resample('M').asfreq()
  2. Note that 7/30/2020 is not the last day of July; 7/31/2020 is. With this method the 7/30/2020 rows will be a problem, so first convert Month End Date to the true month-end date.
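
The two points above can be combined into one runnable sketch. It makes several assumptions that are not in the original answer: the helper name `fill_months` and the variable `last_month` are hypothetical, the common end date is taken to be 9/30/2020 as stated in the question, missing months are given Sales of 0, and the descriptive columns are forward-filled. It also swaps `resample` for an explicit `reindex` onto a month-end `date_range`, because resampling alone stops at each store's last existing date and would not add the 8/31/2020 and 9/30/2020 rows for store BBB.

```python
import pandas as pd

# Sample data copied from the question (note the 7/30/2020 entries).
df = pd.DataFrame({
    "Store": ["AAA"] * 5 + ["BBB"] * 2,
    "Month Opened": ["5/31/2020"] * 5 + ["6/30/2020"] * 2,
    "State": ["NY"] * 5 + ["CT"] * 2,
    "City": ["New York"] * 5 + ["Hartford"] * 2,
    "Month End Date": ["5/31/2020", "6/30/2020", "7/30/2020", "8/31/2020",
                       "9/30/2020", "6/30/2020", "7/30/2020"],
    "Sales": [1000, 5000, 3000, 4000, 2000, 100, 200],
})

# Parse the dates, then snap each one to the true end of its month:
# 7/30/2020 becomes 7/31/2020, dates already at month end are unchanged.
df["Month End Date"] = (
    pd.to_datetime(df["Month End Date"], format="%m/%d/%Y")
    + pd.offsets.MonthEnd(0)
)

# Assumption: every store should have rows up to this month end.
last_month = pd.Timestamp("2020-09-30")


def fill_months(group):
    """Reindex one store's rows onto every month end up to last_month."""
    columns = list(group.columns)
    group = group.set_index("Month End Date")
    full_index = pd.date_range(group.index.min(), last_month,
                               freq=pd.offsets.MonthEnd())
    out = group.reindex(full_index)
    # Assumption: descriptive columns repeat, and a missing month means 0 sales.
    described = ["Store", "Month Opened", "State", "City"]
    out[described] = out[described].ffill()
    out["Sales"] = out["Sales"].fillna(0).astype(int)
    return out.rename_axis("Month End Date").reset_index()[columns]


# Iterate over the groups explicitly and stitch the results back together.
filled = pd.concat(
    [fill_months(group) for _, group in df.groupby("Store")],
    ignore_index=True,
)
print(filled)
```

Passing `pd.offsets.MonthEnd()` as the frequency avoids depending on the `'M'` or `'ME'` string alias, which differs between pandas versions.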
