This is a small example. In my larger dataset, I have multiple years of data and the number of observations per group (div) are not always equal.
Example data:
set.seed(1)
df<-data.frame(
year = 2014,
id = sample(LETTERS[1:26], 12),
div = rep(c("1", "2a", "2b"), each=4),
pts = c(9,7,9,3,7,5,3,7,2,7,7,1),
x = c(10,12,11,7,7,5,4,12,4,6,7,2)
)
df
# year id div pts x
#1 2014 G 1 9 10
#2 2014 J 1 7 12
#3 2014 N 1 9 11
#4 2014 U 1 3 7
#5 2014 E 2a 7 7
#6 2014 S 2a 5 5
#7 2014 W 2a 3 4
#8 2014 M 2a 7 12
#9 2014 L 2b 2 4
#10 2014 B 2b 7 6
#11 2014 D 2b 7 7
#12 2014 C 2b 1 2
I want to rank this data such that individuals in div 1 are ranked higher than div 2a/2b, and within div 1 individuals are ranked 1,2,3,4 based on highest number of 'pts' followed by highest number of 'x'.
Individuals in div 2a and div 2b should be ranked individually also based on the same criteria. This would look like this:
df %>%
group_by(div) %>%
arrange(desc(pts), desc(x)) %>%
mutate(position = row_number(div))
# year id div pts x position
#1 2014 N 1 9 11 1
#2 2014 G 1 9 10 2
#3 2014 J 1 7 12 3
#4 2014 U 1 3 7 4
#5 2014 M 2a 7 12 1
#6 2014 E 2a 7 7 2
#7 2014 S 2a 5 5 3
#8 2014 W 2a 3 4 4
#9 2014 D 2b 7 7 1
#10 2014 B 2b 7 6 2
#11 2014 L 2b 2 4 3
#12 2014 C 2b 1 2 4
However, I want to produce a final column/variable that is another rank. This would rank all individuals in div 1 as higher than 2a/2b, but 2a/2b are equal. i.e. individuals who are 1 in 2a/2b should now get 5.5, individuals who are ranked 2 should now get 7.5. There are always an equal number of individuals in div2a and div2b for all years.
It should look like this:
# year id div pts x position final
#1 2014 N 1 9 11 1 1.0
#2 2014 G 1 9 10 2 2.0
#3 2014 J 1 7 12 3 3.0
#4 2014 U 1 3 7 4 4.0
#5 2014 M 2a 7 12 1 5.5
#6 2014 E 2a 7 7 2 7.5
#7 2014 S 2a 5 5 3 9.5
#8 2014 W 2a 3 4 4 11.5
#9 2014 D 2b 7 7 1 5.5
#10 2014 B 2b 7 6 2 7.5
#11 2014 L 2b 2 4 3 9.5
#12 2014 C 2b 1 2 4 11.5
I need to find a dplyr
solution ideally. Also, it does need to generalize to years where the number of individuals in 'div1' may vary and the number of individuals in div2a/div2b varies (although length(div2a)==length(div2b) always).
See Question&Answers more detail:
os