I would like to understand how to pass strings representing expressions into dplyr, so that the variables mentioned in the string are evaluated as expressions on columns in the dataframe. The main vignette on this topic covers passing in quosures, and doesn't discuss strings at all.
It's clear that quosures are safer and clearer than strings when representing expressions, so of course we should avoid strings when quosures can be used instead. However, when working with tools outside the R ecosystem, such as javascript or YAML config files, one will often have to work with strings instead of quosures.
For example, say I want a function that does a grouped tally using expressions passed in by the user/caller. As expected, the following code doesn't work, since dplyr uses nonstandard evaluation to interpret the arguments to group_by
.
library(tidyverse)
group_by_and_tally <- function(data, groups) {
data %>%
group_by(groups) %>%
tally()
}
my_groups <- c('2 * cyl', 'am')
mtcars %>%
group_by_and_tally(my_groups)
#> Error in grouped_df_impl(data, unname(vars), drop): Column `groups` is unknown
In dplyr 0.5 we would use standard evaluation, such as group_by_(.dots = groups)
, to handle this situation. Now that the underscore verbs are deprecated, how should we do this kind of thing in dplyr 0.7?
In the special case of expressions that are just column names we can use the solutions to this question, but they don't work for more complex expressions like 2 * cyl
that aren't just a column name.
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…