I have the following data:
Name <- c("Sam", "Sarah", "Jim", "Fred", "James", "Sally", "Andrew", "John", "Mairin", "Kate", "Sasha", "Ray", "Ed")
Age <- c(22,12,31,35,58,82,17,34,12,24,44,67,43)
Group <- c("A", "B", "B", "B", "B", "C", "C", "D", "D", "D", "D", "D", "D")
data <- data.frame(Name, Age, Group)
I'd like to use dplyr to:
(1) group the data by "Group"
(2) show the min and max Age within each Group
(3) show the Name of the person with the min and max ages
The following code does this:
library(dplyr)
data %>% group_by(Group) %>%
  summarize(minAge = min(Age), minAgeName = Name[which(Age == min(Age))],
            maxAge = max(Age), maxAgeName = Name[which(Age == max(Age))])
This works well:
  Group minAge minAgeName maxAge maxAgeName
1     A     22        Sam     22        Sam
2     B     12      Sarah     58      James
3     C     17     Andrew     82      Sally
4     D     12     Mairin     67        Ray
However, I have a problem if there are multiple min or max values:
Name <- c("Sam", "Sarah", "Jim", "Fred", "James", "Sally", "Andrew", "John", "Mairin", "Kate", "Sasha", "Ray", "Ed")
Age <- c(22,31,31,35,58,82,17,34,12,24,44,67,43)
Group <- c("A", "B", "B", "B", "B", "C", "C", "D", "D", "D", "D", "D", "D")
data <- data.frame(Name, Age, Group)
> data %>% group_by(Group) %>%
+ summarize(minAge = min(Age), minAgeName = Name[which(Age == min(Age))],
+ maxAge = max(Age), maxAgeName = Name[which(Age == max(Age))])
Error: expecting a single value
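As far as I can tell, the error comes from which(Age == min(Age)) returning more than one index when a group has ties, so summarize() receives two names where it expects one. Here is a small base R check using the group B values from above (Age_B, Name_B and idx are just illustrative names, not part of my real data):

```r
# Group B values from the second data set (illustrative copies)
Age_B  <- c(31, 31, 35, 58)
Name_B <- c("Sarah", "Jim", "Fred", "James")

# which() on an equality test returns every matching position
idx <- which(Age_B == min(Age_B))
idx                 # 1 2
Name_B[idx]         # "Sarah" "Jim"  (two values, not one)

# which.min() returns only the first position of the minimum
which.min(Age_B)    # 1
```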
I'm looking for two solutions:
(1) where it doesn't matter which min or max name is shown, just that one is shown (i.e., the first value found)
(2) where if there are "ties" all minimum values and maximum values are shown
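To make the two cases concrete, here is a rough sketch of the behaviour I have in mind. It assumes which.min()/which.max() are acceptable for "first value found" and that collapsing ties into one comma-separated string is an acceptable way to "show all"; first_match and all_ties are names I made up, and this may well not be the idiomatic dplyr approach.

```r
library(dplyr)

Name  <- c("Sam", "Sarah", "Jim", "Fred", "James", "Sally", "Andrew",
           "John", "Mairin", "Kate", "Sasha", "Ray", "Ed")
Age   <- c(22, 31, 31, 35, 58, 82, 17, 34, 12, 24, 44, 67, 43)
Group <- c("A", "B", "B", "B", "B", "C", "C", "D", "D", "D", "D", "D", "D")
data  <- data.frame(Name, Age, Group, stringsAsFactors = FALSE)

# (1) One name per group: which.min()/which.max() return the index of
#     the first minimum/maximum, so ties resolve to the first match.
first_match <- data %>%
  group_by(Group) %>%
  summarize(minAge = min(Age), minAgeName = Name[which.min(Age)],
            maxAge = max(Age), maxAgeName = Name[which.max(Age)])

# (2) All tied names: keep one row per group by pasting every matching
#     name into a single comma-separated string.
all_ties <- data %>%
  group_by(Group) %>%
  summarize(minAge = min(Age),
            minAgeName = paste(Name[Age == min(Age)], collapse = ", "),
            maxAge = max(Age),
            maxAgeName = paste(Name[Age == max(Age)], collapse = ", "))
```

For group B, the first version should give "Sarah" as minAgeName and the second "Sarah, Jim".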
Please let me know if this isn't clear and thanks in advance!