How can I collapse a data frame in which many observations span multiple rows but have at most one value for each of several variables?
Here's what I have:
id title info var1 var2 var3
1 foo Some string here string 1
1 foo Some string here string 2
1 foo Some string here string 3
2 bar A different string string 4 string 5
2 bar A different string string 6
3 baz Something else string 7 string 8
Here's what I want:
id title info var1 var2 var3
1 foo Some string here string 1 string 2 string 3
2 bar A different string string 4 string 5 string 6
3 baz Something else string 7 string 8
I think I've got it with
ddply(merged, .(id, title, info), summarize, var1 = max(var1), var2 = max(var2), var3 = max(var3))
But the problem is that there are many more of the var1-var3 variables, and they are programmatically generated. As a result, I need a way to insert var1 = max(var1), etc. programmatically, based on a list of the variable names.
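One possible way to drive this from a vector of column names is sketched below. It makes several assumptions: the empty cells are NA, the value columns are character, and the sample `merged` data frame is a hypothetical reconstruction of the table above (the placement of "string 7" and "string 8" in var1 and var3 is a guess). `var_names` and `first_non_na` are names invented for the illustration. Instead of max(), it keeps the first non-missing value per group, which gives the same result when each group has at most one value per variable.

```r
library(plyr)

# Hypothetical stand-in for `merged`; empty cells are assumed to be NA.
merged <- data.frame(
  id    = c(1, 1, 1, 2, 2, 3),
  title = c("foo", "foo", "foo", "bar", "bar", "baz"),
  info  = c(rep("Some string here", 3),
            rep("A different string", 2),
            "Something else"),
  var1  = c("string 1", NA, NA, "string 4", NA, "string 7"),
  var2  = c(NA, "string 2", NA, "string 5", NA, NA),
  var3  = c(NA, NA, "string 3", NA, "string 6", "string 8"),
  stringsAsFactors = FALSE
)

# Names of the generated columns; in practice this vector would be built
# programmatically rather than typed out.
var_names <- c("var1", "var2", "var3")

# First non-missing value of a vector. Indexing an empty vector with [1]
# yields NA of the same type, so all-NA groups stay NA.
first_non_na <- function(x) x[!is.na(x)][1]

# For each (id, title, info) group, collapse every column listed in
# var_names to a single value and return a one-row data frame.
collapsed <- ddply(merged, .(id, title, info), function(piece) {
  as.data.frame(lapply(piece[var_names], first_non_na),
                stringsAsFactors = FALSE)
})

print(collapsed)
```

If the empty cells are empty strings rather than NA, the filter inside `first_non_na` would need to test `nzchar(x)` instead of `!is.na(x)`.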