Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share

r - Using group_by with difftime

I've created a data frame with the following data:

library(dplyr)      # %>%, group_by(), summarize()
library(lubridate)  # ymd_hms()

idCol <- c('1', '1', '2', '2')
stepCol <- c('step1', 'step2', 'step1', 'step2')
timestampCol <- c('01-01-2017:09.00', '01-01-2017:10.00', '01-01-2017:09.00', '01-01-2017:14.00')
mydata <- data.frame(idCol, stepCol, timestampCol)
colnames(mydata) <- c('id', 'steps', 'timestamp')

timestamp is the start time of a step for a given id: when step2 begins, step1 has ended. I'm attempting to generate a tibble that contains the average duration of each step, based on the step start times.

So I'm attempting to generate:

step , averagetime
step1 , 1 hour
step2 , 5 hours

The closest I've got is:

diffTime <- c(0, difftime(ymd_hms(mydata$timestamp[-1]), ymd_hms(mydata$timestamp[-nrow(mydata)]), units="hours"))
diffTime %>% group_by(id, steps) %>% summarize(mean(diffTime))

But this returns an error:

Error in UseMethod("group_by_") : 
  no applicable method for 'group_by_' applied to an object of class "c('double', 'numeric')"

1 Answer


I made some minor edits to your code, but basically you need to store the difftime results as a column of mydata, because group_by() works on a data frame and not on a bare numeric vector:

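# Keep the differences in mydata so they stay attached to id and steps
mydata$diffTime <- c(0, difftime(lubridate::ymd_hms(mydata$timestamp[-1]),
                                 lubridate::ymd_hms(mydata$timestamp[-nrow(mydata)]), units = "hours"))
# Group the data frame (not the vector) and average within each id
diffTime <- mydata %>% group_by(id) %>% summarize(mean(diffTime))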

Returns:

R> diffTime
# A tibble: 2 x 2
     id `mean(diffTime)`
  <chr>            <dbl>
1     1         0.008333
2     2         0.033333
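
Note that these means are minute-scale values, while the question expects durations of 1 and 5 hours. That suggests ymd_hms() is not reading the strings as day-month-year dates with hour.minute times (an inference from the numbers, not something confirmed above). The c(0, ...) difference also runs across the boundary between id 1 and id 2. A minimal sketch of one way around both issues follows. It assumes the timestamps mean day-month-year:hour.minute, assumes the single '09:00' entry is a typo for '09.00', and uses hypothetical names (start, duration_hours, step_durations, average_by_step) that are not in the original code:

```r
library(dplyr)

# Same data as in the question, with the '09:00' entry normalised to '09.00' (assumption)
mydata <- data.frame(
  id = c("1", "1", "2", "2"),
  steps = c("step1", "step2", "step1", "step2"),
  timestamp = c("01-01-2017:09.00", "01-01-2017:10.00",
                "01-01-2017:09.00", "01-01-2017:14.00"),
  stringsAsFactors = FALSE
)

step_durations <- mydata %>%
  # Parse with an explicit format (assumed: day-month-year:hour.minute)
  mutate(start = as.POSIXct(timestamp, format = "%d-%m-%Y:%H.%M", tz = "UTC")) %>%
  arrange(id, start) %>%
  group_by(id) %>%
  # A step ends when the next step of the same id starts;
  # the last step of each id has no end, so its duration is NA
  mutate(duration_hours = as.numeric(difftime(lead(start), start, units = "hours"))) %>%
  ungroup()

# Average duration per step, ignoring steps that never ended
average_by_step <- step_durations %>%
  filter(!is.na(duration_hours)) %>%
  group_by(steps) %>%
  summarize(averagetime = mean(duration_hours))
```

Under these assumptions step1 lasts 1 hour for id 1 and 5 hours for id 2, so its average is 3 hours; step2 has no following step in the data and therefore drops out of the summary. Grouping by id before calling lead() is what keeps a difference from being taken between the last row of one id and the first row of the next.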
