I think curly curly is a good successor to bang bang. Nevertheless, I am still struggling to understand the tidyverse NSE.
Let's say I want to write a simple function that takes a data frame and several columns, reshapes those columns into long format, and turns the resulting name column into a factor. I also want to be able to define the factor levels, or by default keep them in the order in which the selected columns appeared before reshaping.
Below I write the same function twice, once with bang bang and once with curly curly.
library(tidyverse)
df_test <- data.frame(
  id = c("id1", "id2"),
  item1 = c(0, 4),
  item2 = c(3, 2),
  item3 = c(1, 4),
  item4 = c(3, 4),
  item5 = c(1, NA)
)
# Bang bang way
vars_factor_to_long <- function(data, vars) {
  vars_enq <- rlang::enquo(vars)
  vars_name <- unname(tidyselect::vars_select(unique(names(data)), !!vars_enq))
  data <-
    data %>%
    tidyr::pivot_longer(cols = !!vars_enq, names_to = "item", values_to = "value") %>%
    dplyr::mutate(item = factor(item, levels = vars_name))
  data
}
vars_factor_to_long(df_test, item1:item5)
#> # A tibble: 10 x 3
#>    id    item  value
#>    <fct> <fct> <dbl>
#>  1 id1   item1     0
#>  2 id1   item2     3
#>  3 id1   item3     1
#>  4 id1   item4     3
#>  5 id1   item5     1
#>  6 id2   item1     4
#>  7 id2   item2     2
#>  8 id2   item3     4
#>  9 id2   item4     4
#> 10 id2   item5    NA
# Curly curly way works the same, but doesn't need the enquo
vars_factor_to_long2 <- function(data, vars) {
  vars_name <- unname(tidyselect::vars_select(unique(names(data)), {{ vars }}))
  data <-
    data %>%
    tidyr::pivot_longer(cols = {{ vars }}, names_to = "item", values_to = "value") %>%
    dplyr::mutate(item = factor(item, levels = vars_name))
  data
}
vars_factor_to_long2(df_test, item1:item5)
#> # A tibble: 10 x 3
#>    id    item  value
#>    <fct> <fct> <dbl>
#>  1 id1   item1     0
#>  2 id1   item2     3
#>  3 id1   item3     1
#>  4 id1   item4     3
#>  5 id1   item5     1
#>  6 id2   item1     4
#>  7 id2   item2     2
#>  8 id2   item3     4
#>  9 id2   item4     4
#> 10 id2   item5    NA
Created on 2021-02-05 by the reprex package (v0.3.0)
This works well, but I found out that data-variables can also be tunneled inside strings with curly curly, using a syntax similar to glue. For example:
# Curly curly - tunneling data-variable inside strings with glue-like syntax
mean_by <- function(data, by, vars) {
  data %>%
    group_by({{ by }}) %>%
    summarise("{{ vars }}" := mean({{ vars }}, na.rm = TRUE))
}
mean_by(df_test, id, item1)
#> # A tibble: 2 x 2
#>   id    item1
#>   <fct> <dbl>
#> 1 id1       0
#> 2 id2       4
mean_by(df_test, id, item1:item2)
#> # A tibble: 2 x 2
#>   id    `item1:item2`
#>   <fct>         <dbl>
#> 1 id1             1.5
#> 2 id2             3
Is there a way I can tunnel the data-variable names (as in the second example) to be used as factor levels in the functions from the first example? I suspect there would be a problem with this sort of tunneling if there are multiple columns provided, but still, how can I tunnel at least one column name to the RHS?
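To make the question concrete, here is a minimal sketch of one possible way to get column names as plain strings on the RHS. It is not taken from the functions above: `selected_names()`, `one_name()`, `vars_factor_to_long3()` and the small `df` are hypothetical names introduced only for illustration. The sketch assumes that `tidyselect::eval_select()` and `rlang::as_label()` are available in the installed versions, and it avoids the pipe so that it only depends on namespaced calls.

```r
# Small example data (hypothetical, mirrors the shape of df_test).
df <- data.frame(id = c("id1", "id2"), item1 = c(0, 4), item2 = c(3, 2))

# Hypothetical helper: names of the columns matched by a tidyselect
# expression, in selection order. eval_select() returns a named vector
# of column positions, so names() yields the column names as strings.
selected_names <- function(data, vars) {
  names(tidyselect::eval_select(rlang::enquo(vars), data))
}

selected_names(df, item1:item2)      # "item1" "item2"
selected_names(df, c(item2, item1))  # "item2" "item1"

# Hypothetical helper for a single column: turn the defused argument
# into a string label. For a range such as item1:item2 this gives the
# single string "item1:item2", not one name per column.
one_name <- function(var) {
  rlang::as_label(rlang::enquo(var))
}

one_name(item1)  # "item1"

# Assumed variant of the reshaping function that uses the helper idea
# to build the factor levels on the RHS.
vars_factor_to_long3 <- function(data, vars) {
  vars_name <- names(tidyselect::eval_select(rlang::enquo(vars), data))
  long <- tidyr::pivot_longer(
    data,
    cols = {{ vars }},
    names_to = "item",
    values_to = "value"
  )
  dplyr::mutate(long, item = factor(item, levels = vars_name))
}

out <- vars_factor_to_long3(df, item1:item2)
levels(out$item)  # "item1" "item2"
```

The design choice in this sketch is to resolve the selection against the data once and reuse the resulting names, which handles several columns, whereas the string label only makes sense for a single column.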
Any discussion on this new rlang topic would help me understand more.
Thank you.
question from:
https://stackoverflow.com/questions/66063204/rlang-curly-curly-operator-and-tunneling-data-variables-inside-strings-on-rhs