I have a dataframe in the form:
Sales House Station Day Date Time Daypart Total Unique Key
0 CARLTON CARLTON Mon 3AUG20 1213 DAYTIME 0 CARLTON_ 3AUG20
1 CARLTON CARLTON Mon 3AUG20 2307 POSTPEAK 30 CARLTON_ 3AUG20
2 CARLTON CARLTON Tue 4AUG20 1015 COFFEE 30 NaN
3 CARLTON CARLTON Tue 4AUG20 1027 COFFEE 30 CARLTON_ 4AUG20
4 CARLTON CARLTON Wed 5AUG20 1310 DAYTIME 30 CARLTON_ 5AUG20
The Unique Key
column is just a column I have added to try make this process easier (correct me if I am wrong please!). Essentially I would like to sum the Total
column by using the Unique Key
column, but also remove the extra rows associated with the Unique Key
and only leaving one..
As an example, the above df would come out as the below. In this instance there is a match for row 1 and row 2, which the Total
row should be summed, and then row 2 removed..
Sales House Station Day Date Time Daypart Total Unique Key
0 CARLTON CARLTON Mon 3AUG20 1213 DAYTIME 30 CARLTON_ 3AUG20
1 CARLTON CARLTON Tue 4AUG20 1015 COFFEE 30 NaN
2 CARLTON CARLTON Tue 4AUG20 1027 COFFEE 30 CARLTON_ 4AUG20
3 CARLTON CARLTON Wed 5AUG20 1310 DAYTIME 30 CARLTON_ 5AUG20
Is there a way to easily do this?
question from:
https://stackoverflow.com/questions/65917608/how-to-sum-by-column-in-pandas-df-and-remove-additional-rows 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…