Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share

r - Count number of times a combination of events appears in dataframe columns ext

This is an extension of the question asked in "Count number of times combination of events occurs in dataframe columns". I will restate the question so that it is all here:

I have a data frame and I want to count the number of times each combination of events in two columns occurs (in any order), with a zero if a combination doesn't appear.

For example, say I have

df <- data.frame('x' = c('a', 'b', 'c', 'c', 'c'), 
                 'y' = c('c', 'c', 'a', 'a', 'b'))

So

x y
a c
b c
c a
c a
c b

a and b do not occur together, a and c occur together 3 times (rows 1, 3 and 4), and b and c twice (rows 2 and 5), so I would want to return

x-y num
a-b 0
a-c 3
b-c 2

I hope this makes sense? Thanks in advance

1 Answer

This should do it:

res = table(df)

To convert to data frame:

resdf = as.data.frame(res)

The resdf data.frame looks like:

  x y Freq
1 a a    0
2 b a    0
3 c a    2
4 a b    0
5 b b    0
6 c b    1
7 a c    1
8 b c    1
9 c c    0

Note that this answer takes order into account, so a-c and c-a are counted separately. If the order of the two columns is unimportant, sort the values within each row of the original data.frame first, so that a-c and c-a become the same pair, and then apply table() to the sorted data frame as before.

df1 = as.data.frame(t(apply(df,1,sort)))
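
Sorting alone does not produce a zero for pairs that never occur, such as a-b, because table() only counts values that are present in each column. Here is a minimal sketch of one way to get the full result the question asks for, using base R only. The names sorted, lev and pairs are illustrative and do not come from the original answer. Giving both columns the same factor levels is one possible approach, not the only one.

```r
# Example data, as defined in the question's code.
df <- data.frame(x = c("a", "b", "c", "c", "c"),
                 y = c("c", "c", "a", "a", "b"),
                 stringsAsFactors = FALSE)

# Sort the two values within each row so that c-a and a-c become the same pair.
sorted <- t(apply(df, 1, sort))

# Give both columns the same levels so that absent pairs are counted as 0.
lev <- sort(unique(unlist(df)))
res <- table(factor(sorted[, 1], levels = lev),
             factor(sorted[, 2], levels = lev))

# Convert to a data frame and keep only pairs whose first element
# sorts before the second (drops a-a, b-b, ... and the mirrored pairs).
resdf <- as.data.frame(res, stringsAsFactors = FALSE)
names(resdf) <- c("x", "y", "num")
resdf <- resdf[resdf$x < resdf$y, ]

pairs <- data.frame(pair = paste(resdf$x, resdf$y, sep = "-"),
                    num = resdf$num,
                    stringsAsFactors = FALSE)
pairs <- pairs[order(pairs$pair), ]
print(pairs)
```

For the five-row df defined in the question's code, this gives a-b 0, a-c 3 and b-c 2.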
