Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
406 views
in Technique[技术] by (71.8m points)

Combining duplicated rows in R and adding new column containing IDs of duplicates

I have a data frame that looks like this:

Chr start   stop    ref alt Hom/het ID  
chr1    5179574 5183384 ref Del Het 719  
chr1    5179574 5184738 ref Del Het 915  
chr1    5179574 5184738 ref Del Het 951  
chr1    5336806 5358384 ref Del Het 376  
chr1    5347979 5358384 ref Del Het 228  

I would like to merge any duplicate rows, combining the last ID column so that all IDs are in one row/column, like this:

Chr start   stop    ref alt Hom/het ID  
chr1    5179574 5183384 ref Del Het 719  
chr1    5179574 5184738 ref Del Het 915, 951 
chr1    5336806 5358384 ref Del Het 376  
chr1    5347979 5358384 ref Del Het 228  

I have found examples of people removing duplicates and summing a column, but I just want to combine all IDs with duplicate regions in a list in a single column.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Some call to aggregate() should do the trick.

Here's an option that collects the ID's in a list object:

(df1 <- aggregate(df[7], df[-7], unique))
#   Chr   start    stop ref alt Hom.het       ID
# 1 chr1 5179574 5183384 ref Del     Het      719
# 2 chr1 5179574 5184738 ref Del     Het 915, 951
# 3 chr1 5336806 5358384 ref Del     Het      376
# 4 chr1 5347979 5358384 ref Del     Het      228

And here's one that collects them in a character vector:

df2 <- aggregate(df[7], df[-7], 
                 FUN = function(X) paste(unique(X), collapse=", "))

Comparing the results of the two options:

str(df1$ID)
# List of 4
#  $ 0: int 719
#  $ 3: int [1:2] 915 951
#  $ 7: int 376
#  $ 8: int 228

str(df2$ID)
# chr [1:4] "719" "915, 951" "376" "228"

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...