Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Row sums based on groupings or conditions in R

I want to compute row sums in R based on column names.

I have more than 50 columns and have looked at various solutions, including this.

However, this doesn't really answer my question. I have column names such as total_2012Q1, total_2012Q2, total_2012Q3, total_2012Q4, ..., up to total_2014Q4, plus other character variables. I want to sum the quarterly columns within each row by year, so that in the end I have three yearly columns: total_2012, total_2013, total_2014.

I don't want to select columns by position, with something like sample[,2:5]. Is there a way to sum them without manually going through column numbers? Also, if there are character variables as well, how do you restrict the sum to only the int variables you want to add up?

simple reproducible example (pre):

id total_2012Q1 total_2012Q2 total_2013Q1 total_2013Q2 char1 char2
 1         1231         5455         1534         2436    N     Y
 2         3948         1239          223          994    Y     N

reproducible example (post):

id total_2012 total_2013 char1 char2
 1       6686      3970     N     Y
 2       5187      1217     Y     N

Thanks for any suggestions.


1 Answer

You can use split.default, i.e.

sapply(split.default(df, sub('^.*_([0-9]+)Q[0-9]', '\\1', names(df))), rowSums)
#     2012 2013
#[1,]    3   23
#[2,]    7   37
#[3,]    9   49

DATA:

dput(df)
structure(list(total_2012Q1 = c(1, 2, 3), total_2012Q2 = c(2, 
5, 6), total_2013Q1 = c(12, 15, 16), total_2013Q2 = c(11, 22, 
33)), class = "data.frame", row.names = c(NA, -3L))
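
The data above has only numeric columns. For the layout in the question, which also has an id and character columns, the same idea can be applied after selecting the quarterly columns by name. The following is a minimal sketch, not the answerer's code: the helper names is_quarter, year_key, yearly and result are made up for illustration, and it assumes every column to be summed matches the pattern total_YYYYQn.

```r
# Example data mirroring the question: id, quarterly totals, character columns
df <- data.frame(
  id = 1:2,
  total_2012Q1 = c(1231, 3948),
  total_2012Q2 = c(5455, 1239),
  total_2013Q1 = c(1534, 223),
  total_2013Q2 = c(2436, 994),
  char1 = c("N", "Y"),
  char2 = c("Y", "N"),
  stringsAsFactors = FALSE
)

# Select the quarterly columns by name, so id and character columns are never summed
# (assumption: all columns to sum look like total_YYYYQn)
is_quarter <- grepl("^total_[0-9]{4}Q[0-9]$", names(df))

# Grouping key: drop the quarter suffix, e.g. "total_2012Q1" becomes "total_2012"
year_key <- sub("Q[0-9]$", "", names(df)[is_quarter])

# Split the selected columns by year and sum each group row-wise
yearly <- sapply(split.default(df[is_quarter], year_key), rowSums)

# Put the untouched columns and the new yearly totals back together
result <- cbind(df[!is_quarter], yearly)
result
```

Here result holds id, char1 and char2 followed by total_2012 and total_2013, so the columns come out in a different order from the desired output in the question; indexing result by a vector of column names would reorder them if that matters.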
