Many thanks for reading. Apologies for what I'm sure is a simple task.
I have a dataframe:
(Edited: Added extra column not to be included in comparison)
b = c(5, 6, 7, 8, 10, 11)
c = c('david','alan','pete', 'ben', 'richard', 'edd')
d = c('alex','edd','ben','pete','raymond', 'alan')
df = data.frame(b, c, d)
df
   b       c       d
1  5   david    alex
2  6    alan     edd
3  7    pete     ben
4  8     ben    pete
5 10 richard raymond
6 11     edd    alan
I want to compare the group of columns c and d with the group of columns d and c. That is, for one row, I want to compare the combined values in c and d with the combined values in d and c for all other rows.
(Note the values could either be characters or integers)
Where these match, I want to return the indexes of the matching rows, preferably as a list of lists. I need to be able to access the indexes without referring to the values in column c or d.
I.e. for the above dataframe, my expected output would be:
c(c(2, 6), c(3, 4))
((2,6), (3,4))
As:
Row 2: (c + d == alan + edd) = row 6: (d + c == edd + alan)
Row 3: (c + d == pete + ben) = row 4: (d + c == ben + pete)
I understand how to determine the match case for two separate columns using match and melt, but not when the columns are joined together and the comparison has to iterate over all possible row combinations.
I envision something like:
lapply(1:6, function(x), ifelse((df$a & df$b) == (df$b & df$a), index(x), 0))
But obviously that is incorrect and won't work.
I consulted the following questions but have been unable to formulate an answer. I have no idea where to begin.
Matching multiple columns on different data frames and getting other column as result
match two columns with two other columns
Comparing two columns in a data frame across many rows
R Comparing each value of all pairs of columns
How can I achieve the above?
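One possible approach, offered as a sketch rather than a definitive answer: build a combined key for each row in (c, d) order and another in (d, c) order, then use match to look up each row's (c, d) key among the (d, c) keys. The separator string and the names key_cd, key_dc, partner and pairs are assumptions introduced for illustration. The data frame is built inline so that no variable named c shadows the c() function.

```r
# Sample data from the question (column b is not part of the comparison)
df <- data.frame(
  b = c(5, 6, 7, 8, 10, 11),
  c = c('david', 'alan', 'pete', 'ben', 'richard', 'edd'),
  d = c('alex', 'edd', 'ben', 'pete', 'raymond', 'alan'),
  stringsAsFactors = FALSE
)

# Combined key per row in (c, d) order and in (d, c) order.
# Assumption: the separator "\r" never occurs in the data itself.
# paste() coerces to character, so integer columns work as well.
key_cd <- paste(df$c, df$d, sep = "\r")
key_dc <- paste(df$d, df$c, sep = "\r")

# For each row, the index of the first row whose (d, c) equals this row's (c, d);
# NA where no such row exists.
partner <- match(key_cd, key_dc)

# Keep each pair once (row index smaller than its partner) and drop unmatched rows.
i <- seq_len(nrow(df))
keep <- !is.na(partner) & i < partner
pairs <- Map(c, i[keep], partner[keep])
pairs
```

For the sample data, pairs is a list holding the index pairs (2, 6) and (3, 4), and pairs[[1]] gives the first pair without touching the values in c or d. Note that match only returns the first matching row, so if a reversed combination can occur more than once, something like split(i, ...) on a sorted key would be needed instead.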