Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
429 views
in Technique[技术] by (71.8m points)

recursion - R: split string into numeric and return the mean as a new column in a data frame

I have a large data frame with columns that are a character string of numbers such as "1, 2, 3, 4". I wish to add a new column that is the average of these numbers. I have set up the following example:

     set.seed(2015)
     library(dplyr)
     a<-c("1, 2, 3, 4", "2, 4, 6, 8", "3, 6, 9, 12")
     df<-data.frame(a)
     df$a <- as.character(df$a)

Now I can use strsplit to split the string and return the mean for a given row where the [[1]] specifies the first row.

    mean(as.numeric(strsplit((df$a), split=", ")[[1]]))
    [1] 2.5

The problem is when I try to do this in a data frame and reference the row number I get an error.

    > df2<- df %>%
    +   mutate(index = row_number(),
    +          avg = mean(as.numeric(strsplit((df$a), split=", ")
    [[index]])))
    Error in strsplit((df$a), split = ", ")[[1:3]] : 
      recursive indexing failed at level 2

Can anyone explain this error and why I cannot index using a variable? If I replace index with a constant it works, it seems to not like me using a variable there.

Much thanks!

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Try:

library(dplyr)
library(splitstackshape)

df %>%
  mutate(index = row_number()) %>%
  cSplit("a", direction = "long") %>%
  group_by(index) %>%
  summarise(mean = mean(a))

Which gives:

#Source: local data table [3 x 2]
#
#  index mean
#1     1  2.5
#2     2  5.0
#3     3  7.5

Or as per @Ananda's suggestion:

> rowMeans(cSplit(df, "a"), na.rm = T)
# [1] 2.5 5.0 7.5

If you want to keep the result in a data frame you could do:

df %>% mutate(mean = rowMeans(cSplit(., "a"), na.rm = T))

Which gives:

#            a mean
#1  1, 2, 3, 4  2.5
#2  2, 4, 6, 8  5.0
#3 3, 6, 9, 12  7.5

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...