r - Conditionally apply pipeline step depending on external value

Question


asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

Given the dplyr workflow:

library(dplyr)
mtcars %>% 
    tibble::rownames_to_column(var = "model") %>% 
    filter(grepl(x = model, pattern = "Merc")) %>% 
    group_by(am) %>% 
    summarise(meanMPG = mean(mpg))

I'm interested in applying the filter step conditionally, depending on the value of applyFilter.

Solution

With applyFilter <- 1 the rows are filtered using the "Merc" string; otherwise all rows are returned.

applyFilter <- 1


mtcars %>%
  tibble::rownames_to_column(var = "model") %>%
  filter(model %in%
           if (applyFilter) {
             rownames(mtcars)[grepl(x = rownames(mtcars), pattern = "Merc")]
           } else {
             rownames(mtcars)
           }) %>%
  group_by(am) %>%
  summarise(meanMPG = mean(mpg))

Problem

The suggested solution is inefficient because the if/else expression, and the filter step around it, is always evaluated; a more desirable approach would only evaluate the filter step when applyFilter is 1.

Attempt

The workflow I have in mind would look something like this:

mtcars %>% 
    tibble::rownames_to_column(var = "model") %>% 
    # Only apply filter step if condition is met
    if (applyFilter) { 
        filter(grepl(x = model, pattern = "Merc"))
        }
    %>% 
    # Continue 
    group_by(am) %>% 
    summarise(meanMPG = mean(mpg))

Naturally, the syntax above is incorrect. It's only an illustration of how the ideal workflow should look.

Desired answer

I'm not interested in creating an interim object; the workflow should resemble:

startingObject
    %>%
    ...
    conditional filter
    ...
    final object

Ideally, I would like to arrive at a solution where I can control whether the filter call is evaluated or not.


1 Answer

深蓝 · Answer 1 · 2021-10-23T19:18:14+0000

How about this approach:

mtcars %>% 
    tibble::rownames_to_column(var = "model") %>% 
    filter(if(applyFilter == 1) grepl(x = model, pattern = "Merc") else TRUE) %>% 
    group_by(am) %>% 
    summarise(meanMPG = mean(mpg))

This means grepl is only evaluated if applyFilter is 1; otherwise the filter simply recycles a single TRUE and keeps every row.
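One way to convince yourself of this is to count the calls. The sketch below is only an illustration: counting_grepl and calls are hypothetical names that are not part of dplyr, and it assumes a dplyr version in which filter() accepts a single recycled TRUE.

```r
library(dplyr)

# Hypothetical wrapper around grepl() that counts how often it is called
calls <- 0
counting_grepl <- function(...) {
  calls <<- calls + 1
  grepl(...)
}

cars <- tibble::rownames_to_column(mtcars, var = "model")

# Flag off: the else branch is taken, so the wrapper is never reached
applyFilter <- 0
unfiltered <- cars %>%
  filter(if (applyFilter == 1) counting_grepl(x = model, pattern = "Merc") else TRUE)
calls_when_off <- calls

# Flag on: the wrapper (and therefore grepl) is evaluated
applyFilter <- 1
filtered <- cars %>%
  filter(if (applyFilter == 1) counting_grepl(x = model, pattern = "Merc") else TRUE)
calls_when_on <- calls
```

With the flag off, unfiltered should still hold all rows of mtcars and the counter should stay at zero.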

Or another option is to use {}:

mtcars %>% 
  tibble::rownames_to_column(var = "model") %>% 
  {if(applyFilter == 1) filter(., grepl(x = model, pattern = "Merc")) else .} %>% 
  group_by(am) %>% 
  summarise(meanMPG = mean(mpg))
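If you need this pattern in several places, the braces can be hidden inside a small helper. This is a minimal sketch, not an existing dplyr function: filter_when is a made-up name, and it relies on R's lazy evaluation of ... so that the filter expression is never touched when the condition is false.

```r
library(dplyr)

# Hypothetical helper: apply filter() only when `condition` is TRUE,
# otherwise hand the data back unchanged (the dots are never evaluated)
filter_when <- function(.data, condition, ...) {
  if (isTRUE(condition)) filter(.data, ...) else .data
}

applyFilter <- 1

result <- mtcars %>%
  tibble::rownames_to_column(var = "model") %>%
  filter_when(applyFilter == 1, grepl(x = model, pattern = "Merc")) %>%
  group_by(am) %>%
  summarise(meanMPG = mean(mpg))
```

The pipe then reads as one unbroken chain, and the condition sits next to the step it controls.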

There's obviously another possible approach: simply break the pipe, conditionally apply the filter, and then continue the pipe. (I know OP didn't ask for this; I just want to give another example for other readers.)

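library(magrittr)  # provides the %<>% assignment pipe

# Note: %<>% overwrites your working copy of mtcars
mtcars %<>% 
  tibble::rownames_to_column(var = "model")

if(applyFilter == 1) mtcars %<>% filter(grepl(x = model, pattern = "Merc"))

mtcars %>% 
  group_by(am) %>% 
  summarise(meanMPG = mean(mpg))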
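If overwriting mtcars is a concern, the same break-the-pipe idea could be wrapped in a function so the interim object stays local. A rough sketch, where mean_mpg_by_am, apply_filter and pattern are illustrative names I made up for this example:

```r
library(dplyr)

# Hypothetical wrapper: the interim object `cars` lives only inside the function
mean_mpg_by_am <- function(data, apply_filter = FALSE, pattern = "Merc") {
  cars <- tibble::rownames_to_column(data, var = "model")

  # The filter step runs only when the flag is set
  if (apply_filter) {
    cars <- filter(cars, grepl(x = model, pattern = pattern))
  }

  cars %>%
    group_by(am) %>%
    summarise(meanMPG = mean(mpg))
}

with_filter    <- mean_mpg_by_am(mtcars, apply_filter = TRUE)
without_filter <- mean_mpg_by_am(mtcars)
```

This keeps the global mtcars untouched, at the cost of no longer being a single pipe.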
