I have a data frame with several IDs, and I want to build a subset in which each ID contributes exactly one row: the row whose value of Y is closest to 0.5.
This is my data frame:
df <- data.frame(
  ID = c("DB1", "DB1", "DB2", "DB2", "DB3", "DB3", "DB4", "DB4", "DB4"),
  X  = c(0.04, 0.10, 0.10, 0.20, 0.02, 0.30, 0.01, 0.20, 0.30),
  Y  = c(0.34, 0.49, 0.51, 0.53, 0.48, 0.49, 0.49, 0.50, 1.0)
)
This is what I want to get:
ID   X     Y
DB1  0.10  0.49
DB2  0.10  0.51
DB3  0.30  0.49
DB4  0.20  0.50
I know I can filter with ddply (from the plyr package) using something like this:
library(plyr)
# For each ID, keep the first row where Y is exactly 0.50
ddply(df, .(ID), function(z) {
  z[z$Y == 0.50, ][1, ]
})
and this would work fine if there were always a 0.50 value in Y, which is not the case.
How do I replace the == with a "nearest to 0.5" condition, or is there another function I could use instead?
Thank you in advance!
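One possible approach, offered as a minimal sketch rather than a definitive answer: measure each row's absolute distance from 0.5 with abs(Y - 0.5) and keep the row where which.min() finds the smallest distance. The version below uses only base R (split, lapply, rbind) so it runs without extra packages. The names target, pick_closest and closest are made up for illustration, and the data assumes the second ID is meant to be "DB1" rather than "BD1", as the expected output suggests.

```r
df <- data.frame(
  ID = c("DB1", "DB1", "DB2", "DB2", "DB3", "DB3", "DB4", "DB4", "DB4"),
  X  = c(0.04, 0.10, 0.10, 0.20, 0.02, 0.30, 0.01, 0.20, 0.30),
  Y  = c(0.34, 0.49, 0.51, 0.53, 0.48, 0.49, 0.49, 0.50, 1.0)
)

target <- 0.5  # hypothetical name for the value Y should be nearest to

# Return the single row of z whose Y is nearest to target.
# which.min() returns the first minimum, so ties go to the earliest row.
pick_closest <- function(z) z[which.min(abs(z$Y - target)), ]

# Split by ID, pick one row per group, then stack the rows back together.
closest <- do.call(rbind, lapply(split(df, df$ID), pick_closest))
rownames(closest) <- NULL  # drop the group names rbind uses as row names

print(closest)
```

The same helper should also drop into the existing plyr call, as in ddply(df, .(ID), pick_closest), since ddply passes each per-ID data frame to the function in the same way.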