This can be broken down into a two-step problem. First, if we create an indexing column based on the output of rle(), we can group by that column and run cumsum() within each group. The grouping itself can be done with any number of aggregation techniques. I'll show two options, one using data.table and the other using plyr.
library(data.table)
library(plyr)
#A data.table inherits from data.frame, so most data.frame code works on it unchanged
#Fake data
dat <- data.table(dir = sample(-1:1, 20, TRUE), value = rnorm(20))
dir.rle <- rle(dat$dir)
#Compute an indexing column to group by
dat <- transform(dat, indexer = rep(seq_along(dir.rle$lengths), dir.rle$lengths))
#What does the indexer column look like?
head(dat)
#      dir      value indexer
# [1,]   1  0.5045807       1
# [2,]   0  0.2660617       2
# [3,]   1  1.0369641       3
# [4,]   1 -0.4514342       3
# [5,]  -1 -0.3968631       4
# [6,]  -1 -2.1517093       4
#(the data is random, so your values will differ)
#data.table approach
dat[, cumsum(value), by = indexer]
#plyr approach
ddply(dat, "indexer", summarize, V1 = cumsum(value))
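
As a possible shortcut, here is a minimal sketch of the same two steps on a small deterministic input, so the result can be checked by hand. It assumes a data.table version recent enough to provide rleid() (1.9.6 or later, as far as I know), which builds the run index in one call. The column name run_cumsum, the variable base_cumsum, and the input values are made up for illustration. The last line shows a base R alternative using ave().

```r
library(data.table)

# Small deterministic input (illustrative values, not from the question)
dat <- data.table(dir   = c(1, 1, 0, -1, -1, -1, 1),
                  value = c(1, 2, 3, 4, 5, 6, 7))

# Step 1: one id per run of identical consecutive values.
# Same index as rep(seq_along(r$lengths), r$lengths) with r <- rle(dat$dir)
dat[, indexer := rleid(dir)]
# indexer: 1 1 2 3 3 3 4

# Step 2: running sum that restarts at the beginning of each run
dat[, run_cumsum := cumsum(value), by = indexer]
# run_cumsum: 1 3 3 4 9 15 7

# Base R equivalent of step 2, no grouping package needed
base_cumsum <- ave(dat$value, dat$indexer, FUN = cumsum)
```

Unlike the calls above, which return a new table with one row per input row, := adds the result as a column in place, which is handy if you want to keep dir and value next to the running sum.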