In a package I'm working on, I'm using environments to save and retrieve the labels of a dataframe.
In a magrittr pipeline, I want to save them in an environment variable which I would retrieve later.
However, I'm facing a problem: it seems as if the environment variables were not modified until the end of the pipeline.
Here is an example with the most useful functions:
devtools::install_github("DanChaltiel/crosstable", build_vignettes=TRUE)
library(crosstable) # for set_label() and get_label(), but you can test
                    # with other label-management packages (Hmisc, expss...)
# environment holding the most recently saved labels
labels_env = rlang::new_environment()

# store the name/label pairs of .tbl, then return .tbl invisibly
save_labels = function(.tbl){
  labels_env$last_save = tibble(
    name=names(.tbl),
    label=get_label(.tbl)[.data$name]
  )
  invisible(.tbl)
}

get_last_save = function(){
  labels_env$last_save
}

# re-apply the saved labels to the columns of .tbl
import_labels = function(.tbl){
  data_label = get_last_save()
  for(i in 1:nrow(data_label)){
    name = as.character(data_label[i, "name"])
    label = as.character(data_label[i, "label"])
    .tbl[name] = set_label(.tbl[name], label)
  }
  .tbl
}
This works exactly as intended, as the label for disp would be NULL otherwise:
library(dplyr)
library(crosstable)
save_labels(mtcars2)
mtcars2 %>%
  transmute(disp=as.numeric(disp)+1) %>% #removes the label attribute of disp
  import_labels() %>% #
  crosstable(disp)
#>    .id                 label   variable               value
#> 1 disp Displacement (cu.in.)  Min / Max        72.1 / 473.0
#> 2 disp Displacement (cu.in.)  Med [IQR] 197.3 [121.8;327.0]
#> 3 disp Displacement (cu.in.) Mean (std)       231.7 (123.9)
#> 4 disp Displacement (cu.in.)     N (NA)              32 (0)
Created on 2021-01-26 by the reprex package (v0.3.0)
However, save_labels(mtcars2) returns mtcars2 invisibly, so I'd like to be able to pipe the whole sequence. Unfortunately, this throws an error:
library(dplyr)
library(crosstable)
mtcars2 %>%
  save_labels() %>%
  transmute(disp=as.numeric(disp)+1) %>%
  import_labels() %>% #
  crosstable(disp)
#> Error in .subset2(x, i, exact = exact): attempt to select less than one element in get1index
Created on 2021-01-26 by the reprex package (v0.3.0)
Indeed, when using pipes, the environment variable is not set yet when we get to import_labels(). If I re-run this code, it won't throw any error, but that would be misleading, as it would refer to the previous value of labels_env$last_save.
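A possible explanation, offered as an assumption rather than a confirmed diagnosis: since magrittr 2.0 the pipe evaluates its left-hand side lazily, so x %>% f() %>% g() behaves much like the nested call g(f(x)). A function that reads the environment before touching its argument would then run before the upstream step has produced its side effect. The base R sketch below reproduces that ordering with plain nested calls. The names env, save_value(), read_lazy() and read_forced() are hypothetical stand-ins for labels_env, save_labels() and import_labels(); they are not part of crosstable.

```r
# Hypothetical stand-ins, base R only (no magrittr or crosstable needed).
env <- new.env()

# Like save_labels(): stores something as a side effect, returns its input.
save_value <- function(x) {
  env$last_save <- x
  invisible(x)
}

# Like import_labels() above: reads the environment BEFORE using its argument.
read_lazy <- function(x) {
  saved <- env$last_save  # x is still an unevaluated promise at this point
  x                       # x is forced only now, so save_value() runs too late
  saved
}

# Same reader, but the argument is forced first.
read_forced <- function(x) {
  force(x)                # evaluates save_value(...) before reading env
  env$last_save
}

res_lazy <- read_lazy(save_value(1))      # nothing had been saved when env was read
rm("last_save", envir = env)              # reset between the two runs
res_forced <- read_forced(save_value(2))  # reads the value that was just saved

print(is.null(res_lazy))
print(res_forced)
```

If the same mechanism is at play in the pipeline, calling force(.tbl) as the first statement of import_labels() would be one thing to try; whether it resolves the error in the package itself is untested here.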
My understanding of pipes is not good enough to get this working. Moreover, it seems to be specific to the package environment, as I could not reproduce this behavior in a plain R script.
Is there a way I can use pipes with such an environment variable inside a package?
question from:
https://stackoverflow.com/questions/65924644/package-environments-are-not-working-as-expected-in-a-pipeline