Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

r - rlang: Curly curly operator and tunneling data-variables inside strings on RHS

I think curly curly is a good successor to bang bang. Nevertheless, I am still struggling to understand the tidyverse NSE.

Let's say I want a simple function that takes a data frame and several of its columns, reshapes those columns into long format, and turns the resulting names column into a factor. I would also like to be able to define the factor levels, or by default keep them in the order the selected columns had before reshaping.

I wrote the same function twice, once with bang bang and once with curly curly.

library(tidyverse)

df_test <- data.frame(
  id = c("id1", "id2"),
  item1 = c(0, 4),
  item2 = c(3, 2),
  item3 = c(1, 4),
  item4 = c(3, 4),
  item5 = c(1, NA)
)

# Bang bang way
vars_factor_to_long <- function(data, vars) {                     
  vars_enq <- rlang::enquo(vars)
  vars_name <- unname(tidyselect::vars_select(unique(names(data)), !!vars_enq))
  data <-
    data %>%
    tidyr::pivot_longer(cols = !!vars_enq, names_to = "item", values_to = "value") %>%
    dplyr::mutate(item = factor(item, levels = vars_name)) 
  data
}

vars_factor_to_long(df_test, item1:item5)
#> # A tibble: 10 x 3
#>    id    item  value
#>    <fct> <fct> <dbl>
#>  1 id1   item1     0
#>  2 id1   item2     3
#>  3 id1   item3     1
#>  4 id1   item4     3
#>  5 id1   item5     1
#>  6 id2   item1     4
#>  7 id2   item2     2
#>  8 id2   item3     4
#>  9 id2   item4     4
#> 10 id2   item5    NA

# Curly curly way works the same, but doesn't need the enquo
vars_factor_to_long2 <- function(data, vars) {                     
  vars_name <- unname(tidyselect::vars_select(unique(names(data)), {{vars}}))
  data <-
    data %>%
    tidyr::pivot_longer(cols = {{vars}}, names_to = "item", values_to = "value") %>%
    dplyr::mutate(item = factor(item, levels = vars_name))
  data
}

vars_factor_to_long2(df_test, item1:item5)
#> # A tibble: 10 x 3
#>    id    item  value
#>    <fct> <fct> <dbl>
#>  1 id1   item1     0
#>  2 id1   item2     3
#>  3 id1   item3     1
#>  4 id1   item4     3
#>  5 id1   item5     1
#>  6 id2   item1     4
#>  7 id2   item2     2
#>  8 id2   item3     4
#>  9 id2   item4     4
#> 10 id2   item5    NA

Created on 2021-02-05 by the reprex package (v0.3.0)

This works well. I also found out that data-variables can be tunneled inside strings with curly curly, using a syntax similar to glue. For example:

# Curly curly - tunneling data-variable inside strings with glue-like syntax
mean_by <- function(data, by, vars) {
  data %>%
    group_by({{ by }}) %>%
    summarise("{{ vars }}" := mean({{ vars }}, na.rm = TRUE))
}

mean_by(df_test, id, item1)
#> # A tibble: 2 x 2
#>   id    item1
#>   <fct> <dbl>
#> 1 id1       0
#> 2 id2       4
mean_by(df_test, id, item1:item2)
#> # A tibble: 2 x 2
#>   id    `item1:item2`
#>   <fct>         <dbl>
#> 1 id1             1.5
#> 2 id2             3

Created on 2021-02-05 by the reprex package (v0.3.0)
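
The `item1:item2` column in the output above can be reproduced by hand, which suggests what is happening: inside `mean()`, the tunneled expression is evaluated as ordinary data-masked R code rather than as a column selection, so `:` builds a numeric sequence from the two per-group values. A minimal base R sketch of that reading, with the values copied from `df_test` (the `id1`/`id2` lists are illustrative helpers, not part of the original code):

```r
# Values of item1 and item2 within each id group, copied from df_test above
id1 <- list(item1 = 0, item2 = 3)
id2 <- list(item1 = 4, item2 = 2)

# Inside mean(), `:` builds a numeric sequence between the two values
mean_id1 <- mean(id1$item1:id1$item2)  # mean of 0, 1, 2, 3
mean_id2 <- mean(id2$item1:id2$item2)  # mean of 4, 3, 2

c(id1 = mean_id1, id2 = mean_id2)
#> id1 id2 
#> 1.5 3.0
```

The result matches the `mean_by(df_test, id, item1:item2)` output, so the call does not average two columns.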

Is there a way I can tunnel the data-variable names (as in the second example) to be used as factor levels in the functions from the first example? I suspect there would be a problem with this sort of tunneling if there are multiple columns provided, but still, how can I tunnel at least one column name to the RHS?

Any discussion on this new rlang topic would help me understand more. Thank you.

Question from: https://stackoverflow.com/questions/66063204/rlang-curly-curly-operator-and-tunneling-data-variables-inside-strings-on-rhs


1 Answer


You need to tunnel the user selection and then get the names in any case, so I think your method is good. I would simplify it a bit by using dplyr instead of the lower-level tidyselect, though:

vars_factor_to_long2 <- function(data, vars) {
  # Tunnel the selection into select() and keep only the resulting column names
  vars <- names(dplyr::select(data, {{ vars }}))

  data %>%
    tidyr::pivot_longer(
      # `vars` is now a character vector, so select with all_of()
      cols = dplyr::all_of(vars),
      names_to = "item",
      values_to = "value"
    ) %>%
    dplyr::mutate(
      # Factor levels follow the order of the selected columns
      item = factor(item, levels = vars)
    )
}
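
The question also asks for optional user-defined factor levels and for a way to get a single column name as a string on the RHS. The following is only a sketch under assumptions: the `item_levels` argument and the function names `vars_factor_to_long3` and `col_label` are hypothetical and do not come from the code above. It relies on `rlang::as_label()` with `rlang::enquo()` to turn one tunneled column into a string.

```r
library(dplyr)
library(tidyr)

df_test <- data.frame(
  id = c("id1", "id2"),
  item1 = c(0, 4),
  item2 = c(3, 2),
  item3 = c(1, 4)
)

# Hypothetical variant: `item_levels` is an assumed extra argument
vars_factor_to_long3 <- function(data, vars, item_levels = NULL) {
  # Resolve the tunneled selection to column names
  vars <- names(dplyr::select(data, {{ vars }}))

  # Default to the order of the selected columns
  if (is.null(item_levels)) item_levels <- vars

  data %>%
    tidyr::pivot_longer(
      cols = dplyr::all_of(vars),
      names_to = "item",
      values_to = "value"
    ) %>%
    dplyr::mutate(item = factor(item, levels = item_levels))
}

res_default <- vars_factor_to_long3(df_test, item1:item3)
levels(res_default$item)
#> [1] "item1" "item2" "item3"

res_custom <- vars_factor_to_long3(
  df_test, item1:item3,
  item_levels = c("item3", "item1", "item2")
)
levels(res_custom$item)
#> [1] "item3" "item1" "item2"

# Hypothetical helper: label of a single tunneled column as a plain string
col_label <- function(var) rlang::as_label(rlang::enquo(var))
col_label(item1)
#> [1] "item1"
```

For a multi-column selection such as `item1:item3`, `col_label()` would return the single string `"item1:item3"`, so resolving the names through `dplyr::select()` is still needed in that case.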
