Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

r - Package environments are not working as expected in a pipeline

In a package I'm working on, I'm using environments to save and retrieve the labels of a dataframe.

In a magrittr pipeline, I want to save them in a variable stored in an environment, which I would retrieve later in the same pipeline.

However, I'm facing a problem: it seems as if the environment variables were not modified until the end of the pipeline.

Here is an example with the most useful functions:

devtools::install_github("DanChaltiel/crosstable", build_vignettes=TRUE)
library(crosstable) #for functions set_label() and get_label() but you can test 
                    #with other label-management packages (Hmisc, expss...)

labels_env = rlang::new_environment()

# Store the names and labels of .tbl in labels_env, then pass .tbl through.
save_labels = function(.tbl){
    labels_env$last_save = tibble::tibble(
        name=names(.tbl),
        label=get_label(.tbl)[.data$name]
    )
    invisible(.tbl)
}
get_last_save = function(){
    labels_env$last_save
}
# Re-apply the last saved labels to the matching columns of .tbl.
import_labels = function(.tbl, name_from="name", label_from="label"){
    data_label = get_last_save()
    for(i in seq_len(nrow(data_label))){
        name = as.character(data_label[i, name_from])
        label = as.character(data_label[i, label_from])
        .tbl[name] = set_label(.tbl[name], label)
    }
    .tbl
}

This works exactly as intended; without import_labels(), the label for disp would be NULL:

library(dplyr)
library(crosstable)
save_labels(mtcars2)
mtcars2 %>%
  transmute(disp=as.numeric(disp)+1) %>%  #removes the label attribute of disp
  import_labels() %>% #
  crosstable(disp)
#>    .id                 label   variable               value
#> 1 disp Displacement (cu.in.)  Min / Max        72.1 / 473.0
#> 2 disp Displacement (cu.in.)  Med [IQR] 197.3 [121.8;327.0]
#> 3 disp Displacement (cu.in.) Mean (std)       231.7 (123.9)
#> 4 disp Displacement (cu.in.)     N (NA)              32 (0)

Created on 2021-01-26 by the reprex package (v0.3.0)

However, save_labels(mtcars2) returns mtcars2 invisibly, so I'd like to be able to pipe the whole sequence. Unfortunately, this throws an error:

library(dplyr)
library(crosstable)
mtcars2 %>%
  save_labels() %>% 
  transmute(disp=as.numeric(disp)+1) %>%
  import_labels() %>% #
  crosstable(disp)
#> Error in .subset2(x, i, exact = exact): attempt to select less than one element in get1index

Created on 2021-01-26 by the reprex package (v0.3.0)

Indeed, when using pipes, labels_env$last_save is not set yet when we get to import_labels(). If I re-run this code, it won't throw any error, but that is misleading because it then reads the value of labels_env$last_save left over from the previous run.

My understanding of pipes is not good enough to get this working. Moreover, it seems to be specific to the package environment, as I could not reproduce this behavior in a plain R script.

Is there a way I can use pipes with such an environment variable inside a package?

question from: https://stackoverflow.com/questions/65924644/package-environments-are-not-working-as-expected-in-a-pipeline


1 Answer


This was actually caused by a breaking change when the package magrittr (which provides pipes) went from v1.5 to v2.0.

This was explained in the magrittr 2.0 announcement on the tidyverse blog and in the package's NEWS.md.

A more specific reproducible example can be found in this GitHub issue.

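In the new magrittr version, the evaluation sequence has changed: piped expressions are now evaluated lazily, so each step only runs when the next function first uses its argument. Here, import_labels() calls get_last_save() before it ever touches .tbl, so save_labels() has not run yet at that point. For side effects to happen in the correct order, you have to force the evaluation: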

import_labels = function(.tbl, name_from="name", label_from="label"){
    force(.tbl) #force evaluation, so that save_labels() runs first
    data_label = get_last_save()
    for(i in seq_len(nrow(data_label))){
        name = as.character(data_label[i, name_from])
        label = as.character(data_label[i, label_from])
        .tbl[name] = set_label(.tbl[name], label)
    }
    .tbl
}
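
The ordering problem can be reproduced in base R, without magrittr or crosstable. This is only a sketch: the names store, save_step, read_lazy and read_forced are made up for illustration, and the nested call f(g(x)) is assumed to be roughly how magrittr 2.0 evaluates x %>% g() %>% f().

```r
# A private environment standing in for the package-level labels_env.
store <- new.env()

# Side-effecting step: records a value, then passes its input through.
save_step <- function(x) {
  store$saved <- "set by save_step"
  invisible(x)
}

# Reads the store BEFORE touching x, so save_step() has not run yet.
read_lazy <- function(x) {
  seen <- store$saved  # the promise for x is still unevaluated here
  force(x)             # save_step() only runs now, too late
  seen
}

# Forces x first, so save_step() runs before the store is read.
read_forced <- function(x) {
  force(x)
  store$saved
}

# Nested calls evaluate their arguments lazily, like the new pipe.
lazy_result <- read_lazy(save_step(1))

store$saved <- NULL  # reset between the two runs
forced_result <- read_forced(save_step(1))

print(is.null(lazy_result))  # TRUE: the store was read too early
print(forced_result)         # "set by save_step"
```

The same reasoning applies to any function in a pipeline that relies on a side effect of an earlier step: it should touch (or force) its piped argument before reading the shared state.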
