Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - groupby() cumsum() with 4 columns

What is the correct syntax for this code:

db_deals['nem_col'] = db_deals.groupby(['client_id', 'acc_number'])[['swap','profit']].cumsum()

I want to add together the cumulative sums of the swap and profit columns, computed separately for each unique combination of client_id and acc_number.

The error message is:

ValueError: Wrong number of items passed 2, placement implies 1
Question from: https://stackoverflow.com/questions/66067688/groupby-cumsum-with-4-columns
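
For context, the error most likely comes from the shape of the right-hand side: selecting two columns before cumsum() gives a two-column DataFrame, while a single new column needs a one-dimensional Series. A minimal sketch of that mismatch, assuming a small made-up frame named df in place of db_deals (the values are hypothetical):

```python
import pandas as pd

# Hypothetical stand-in for db_deals, small enough to inspect by hand.
df = pd.DataFrame({
    "client_id":  [1, 1, 2],
    "acc_number": [150, 150, 150],
    "swap":       [6, 3, 7],
    "profit":     [56, 54, 70],
})

# Selecting two columns before cumsum() yields a two-column DataFrame ...
two_cols = df.groupby(["client_id", "acc_number"])[["swap", "profit"]].cumsum()
print(two_cols.shape)

# ... whereas a single new column needs a one-dimensional Series,
# which summing across the columns provides.
one_col = two_cols.sum(axis=1)
print(one_col.shape)
```

The exact wording of the exception can differ between pandas versions, so the sketch only inspects shapes rather than reproducing the error.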


1 Answer

  1. Use agg() to state which function to apply to which column (here, cumsum on swap and profit).
  2. Use sum(axis=1) to add up, row by row, the columns produced in step 1.
  3. Use sort_values() to put each group's rows together so the result is easy to check.
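import numpy as np
import pandas as pd

# Random sample data, so the values differ on every run.
df = pd.DataFrame({"client_id": np.random.randint(1, 3, 15),
                   "acc_number": np.random.randint(150, 155, 15),
                   "swap": np.random.randint(2, 8, 15),
                   "profit": np.random.randint(50, 75, 15),
                   })

# Per-group cumsum of each column, summed across columns into nem_col,
# then sorted so each group's rows sit together.
df = (df.assign(nem_col=
                df.groupby(["client_id", "acc_number"])
                  .agg({c: "cumsum" for c in ["swap", "profit"]})
                  .sum(axis=1))
        .sort_values(["client_id", "acc_number"])
)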

    client_id  acc_number  swap  profit  nem_col
0           1         150     6      56       62
4           1         150     3      54      119
7           1         150     5      68      192
1           1         152     6      67       73
5           1         152     2      70      145
13          1         152     2      66      213
14          1         152     4      68      285
2           1         153     3      67       70
8           1         153     7      62      139
12          1         153     2      74      215
6           2         150     7      70       77
3           2         152     2      50       52
10          2         153     6      53       59
11          2         153     2      63      124
9           2         154     5      52       57
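
Because the sample above is random, it cannot be reproduced exactly. As a cross-check, here is a minimal sketch of the same steps on a small fixed frame (the values are made up). It swaps agg() for a direct cumsum() on the two selected columns, which is a variation on the answer rather than the answer's own code:

```python
import pandas as pd

# Small hand-made frame (hypothetical values) so the result can be checked by eye.
df = pd.DataFrame({
    "client_id":  [1, 1, 1, 2, 2],
    "acc_number": [150, 150, 152, 150, 150],
    "swap":       [6, 3, 6, 7, 2],
    "profit":     [56, 54, 67, 70, 50],
})

# Per-group running totals of each column; the result keeps df's row index.
running = df.groupby(["client_id", "acc_number"])[["swap", "profit"]].cumsum()

# Collapse the two running totals into one value per row and store it.
df["nem_col"] = running.sum(axis=1)

print(df.sort_values(["client_id", "acc_number"]))
```

For the group (1, 150) the running totals are swap 6, 9 and profit 56, 110, so nem_col is 62 and then 119, matching the pattern in the first two rows of the output above.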
