Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
377 views
in Technique[技术] by (71.8m points)

dplyr - Finding the highest sum of consecutive non-zero numbers by group in R

I have a long list of results from serial observations by groups. The dataset looks like here:

df <- data.frame (ID = c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3),
                  EV = c(0,0,3,2,3,0,1,1,1,2,0,0,1,0,0,0,4,5,1,2,0,2,2,2,2))

ID #1 showed two consecutive series of non-zero numbers, 3-2-3 and 1-1. The sum of 3-2-3 is higher than 1-1, I want 8. ID #2 had 1-2 and 1, therefore the correct result is 3. ID #3 has 4-5-1-2 and 2-2-2-2, and so the desired result is 12.

I want to have a dataframe as output like here;

  ID RE
1  1  8
2  2  3
3  3 12

Thanks in advance.

question from:https://stackoverflow.com/questions/66051817/finding-the-highest-sum-of-consecutive-non-zero-numbers-by-group-in-r

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

I've got a data.table approach.

library( data.table )
#make it a data.table
setDT(df)
#make groups based on group == 0 and get their sums
ans <- df[, .(sum = sum(EV)), by = .(ID, rleid( ID, EV == 0 ) ) ]
#get the maximum sum for each ID, and drop the rleid column created in the previous row
ans[ ans[, .I[sum == max(sum)], by=.(ID)]$V1][, rleid := NULL][]
#    ID sum
# 1:  1   8
# 2:  2   3
# 3:  3  12

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...