Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


0 votes
432 views
in Technique by (71.8m points)

r - How do I create new rows based on cell value?

I have a dataframe df where:

 Days    Treatment A   Treatment B   Treatment C 
 0           5             1             1   
 1           0             2             3      
 2           1             1             0         

For example, 5 individuals receiving Treatment A survived 0 days, 1 survived 2 days, and so on. I would like each of those individuals to become its own row, with the cell under their treatment holding the number of days they survived:

 Patient #   A   B   C       
 1           0        
 2           0  
 3           0      
 4           0        
 5           0   
 6           2      
 7               0         
 8               1   
 9               1     
 10              2        
 11                  0
 12                  1   
 13                  1      
 14                  1

Let Patient # = an arbitrary value.

I am sorry if this is not descriptive enough, but I appreciate any and all help you have to offer! I have the dataset in Excel at the moment, but I can place it into R if that's easier.


Fight the dragon too long and you become the dragon; gaze into the abyss too long and the abyss gazes back…

1 Answer

0 votes
by (71.8m points)

We can replicate the 'Days' values by the counts in each 'Treatment' column and collect the results in a list, then create a matching list of patient sequence numbers, use Map to construct one data.frame per treatment, and finally combine them with bind_rows

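library(dplyr)
# For each treatment column, repeat each 'Days' value by its patient count
lst1 <- lapply(df[-1], function(x) rep(df$Days, x))
# Number the patients 1..N across all treatments, keeping the shape of lst1
ids <- relist(seq_along(unlist(lst1)), skeleton = lst1)
# One data.frame per treatment; bind_rows fills the missing columns with NA
bind_rows(Map(function(x, y, z) setNames(data.frame(x, y), c("Patient", z)),
              ids, lst1, sub("Treatment\\s+", "", names(lst1))))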

-output

#    Patient  A  B  C
#1        1  0 NA NA
#2        2  0 NA NA
#3        3  0 NA NA
#4        4  0 NA NA
#5        5  0 NA NA
#6        6  2 NA NA
#7        7 NA  0 NA
#8        8 NA  1 NA
#9        9 NA  1 NA
#10      10 NA  2 NA
#11      11 NA NA  0
#12      12 NA NA  1
#13      13 NA NA  1
#14      14 NA NA  1
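
To see what the first two steps produce before Map and bind_rows combine them, the intermediate values can be printed on their own. This is a minimal base R sketch using the sample data from the question; the name days_a is only illustrative and not part of the original answer.

```r
# Sample data from the question (column names contain spaces,
# hence check.names = FALSE)
df <- data.frame(Days = 0:2,
                 `Treatment A` = c(5L, 0L, 1L),
                 `Treatment B` = c(1L, 2L, 1L),
                 `Treatment C` = c(1L, 3L, 0L),
                 check.names = FALSE)

# rep() with a vector of counts repeats each element that many times;
# a count of 0 drops the element entirely
days_a <- rep(df$Days, df[["Treatment A"]])
print(days_a)
# [1] 0 0 0 0 0 2

# The same expansion for every treatment column
lst1 <- lapply(df[-1], function(x) rep(df$Days, x))

# Running patient numbers, cut into pieces shaped like lst1
ids <- relist(seq_along(unlist(lst1)), skeleton = lst1)
print(ids[["Treatment B"]])
# [1]  7  8  9 10
```

Because the numbering runs across the flattened list, patients in Treatment B start where Treatment A ends, which is why each patient appears in exactly one treatment column in the output above.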

Or another option with reshaping into 'long' and then to 'wide'

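library(dplyr)
library(tidyr)
df %>%
    pivot_longer(cols = -Days) %>%                    # one row per Days/treatment pair
    separate(name, into = c('name1', 'name2')) %>%    # "Treatment A" -> "Treatment", "A"
    group_by(name2) %>%
    # multi-row summarise() is deprecated as of dplyr 1.1.0;
    # there, reframe(value = rep(Days, value)) is the replacement
    summarise(value = rep(Days, value), .groups = 'drop') %>% 
    mutate(Patient = row_number()) %>%                # arbitrary patient id
    pivot_wider(names_from = name2, values_from = value)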

-output

# A tibble: 14 x 4
#   Patient     A     B     C
#     <int> <int> <int> <int>
# 1       1     0    NA    NA
# 2       2     0    NA    NA
# 3       3     0    NA    NA
# 4       4     0    NA    NA
# 5       5     0    NA    NA
# 6       6     2    NA    NA
# 7       7    NA     0    NA
# 8       8    NA     1    NA
# 9       9    NA     1    NA
#10      10    NA     2    NA
#11      11    NA    NA     0
#12      12    NA    NA     1
#13      13    NA    NA     1
#14      14    NA    NA     1
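
One way to sanity-check either result is to count the expanded values back and compare them with the original table. The sketch below is not part of the original answer; it rebuilds the per-patient values in base R from the question's sample data and tabulates them per treatment.

```r
# Sample data from the question
df <- data.frame(Days = 0:2,
                 `Treatment A` = c(5L, 0L, 1L),
                 `Treatment B` = c(1L, 2L, 1L),
                 `Treatment C` = c(1L, 3L, 0L),
                 check.names = FALSE)

# One 'Days' value per patient, per treatment
lst1 <- lapply(df[-1], function(x) rep(df$Days, x))

# Count patients per day within each treatment; factor() with explicit
# levels keeps days that have zero patients
counts <- sapply(lst1, function(v) {
  as.integer(table(factor(v, levels = df$Days)))
})

# The counts should reproduce the original treatment columns
print(all(counts == as.matrix(df[-1])))

# Total number of patient rows expected in the reshaped output
print(sum(lengths(lst1)))
```

If the comparison is not TRUE, or the total differs from the number of rows in the reshaped data, some counts were lost or duplicated during the expansion.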

data

df <- structure(list(Days = 0:2,
  `Treatment A` = c(5L, 0L, 1L),
  `Treatment B` = c(1L, 2L, 1L),
  `Treatment C` = c(1L, 3L, 0L)),
  class = "data.frame", row.names = c(NA, -3L))
