Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

r - Group by multiple columns and sum other multiple columns

I have a data frame with about 200 columns. I want to group the table by the first 10 or so columns, which are factors, and sum the rest of the columns.

I have a list of all the column names I want to group by and a list of all the columns I want to aggregate.

The output I am looking for should be the same data frame with the same number of columns, just grouped together.

Is there a solution using data.table, plyr, or any other package?


1 Answer


The data.table way is:

DT[, lapply(.SD,sum), by=list(col1,col2,col3,...)]

or

DT[, lapply(.SD,sum), by=colnames(DT)[1:10]]

Here .SD is the (S)ubset of (D)ata for each group, excluding the group columns. (Aside: if you need to refer to the group columns generically, the current group's values are in .BY.)
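Since the question mentions already having the group-by column names and the columns to aggregate as lists, a minimal sketch of that case might look like the following. The table contents and the names `g1`, `g2`, `v1`, `v2`, `group_cols`, and `sum_cols` are made up for illustration. `.SDcols` is used to restrict `.SD` to the columns being summed.

```r
library(data.table)

# Small illustrative table (hypothetical data): two grouping columns, two numeric columns
DT <- data.table(
  g1 = c("a", "a", "b", "b"),
  g2 = c("x", "x", "y", "z"),
  v1 = c(1, 2, 3, 4),
  v2 = c(10, 20, 30, 40)
)

# Column names held in character vectors, as described in the question
group_cols <- c("g1", "g2")
sum_cols   <- c("v1", "v2")

# by accepts a character vector of column names;
# .SDcols limits .SD to the columns that should be summed
result <- DT[, lapply(.SD, sum), by = group_cols, .SDcols = sum_cols]
print(result)
```

If `.SDcols` is omitted, `.SD` contains every column that is not in `by`, which is the form shown above. The result keeps the same columns as the input, with one row per distinct combination of the grouping columns.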

