In R, what exactly is the problem with having variables with the same name as base R functions?

Question

asked Oct 17, 2021 in Technique by 深蓝 (71.8m points)

It seems to be generally considered poor programming practice to give variables the same names as functions in base R.

For example, it is tempting to write:

data <- data.frame(...)
df   <- data.frame(...)

However, data is the base function that loads data sets, and df is the function that computes the density of the F distribution.

Similarly, it is tempting to write:

a <- 1
b <- 2
c <- 3

This is considered bad form because the function c will combine its arguments.

But: In that workhorse of R functions, lm, to compute linear models, data is used as an argument. In other words, data becomes an explicit variable inside the lm function.

So: If the R core team can use identical names for variables and functions, what stops us mere mortals?

The answer is not that R will get confused. Try the following example, where I explicitly assign a variable with the name c. R doesn't get confused at all with the difference between variable and function:

> c("A", "B")
[1] "A" "B"

> c <- c("Some text", "Second", "Third")
> c(1, 3, 5)
[1] 1 3 5

> c[3]
[1] "Third"

The question: What exactly is the problem with having a variable with the same name as a base R function?

1 Answer

深蓝 · Answer 1 · 2021-10-16T23:59:20+0000

There isn't really one. When R evaluates a call such as mean(x), it looks for a function named mean and normally skips any non-function objects of that name:

> mean(1:10)
[1] 5.5
> mean <- 1
> mean(1:10)
[1] 5.5
> rm(mean)
> mean(1:10)
[1] 5.5
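The transcript above covers a name used in call position. As a rough sketch of where the rule stops helping, the snippet below contrasts call position with value position, and a non-function mask with a function mask. The variable names (call_result and so on) are made up for illustration and are not part of the original answer.

```r
mean <- 1                          # a non-function object named like base::mean

# Call position: R looks specifically for a function, so the numeric
# `mean` is skipped and the base function is used.
call_result <- mean(1:10)

# Value position: ordinary lookup returns the first binding, the number.
value_is_function <- is.function(mean)

# Masking with another *function* does change what a call finds.
mean <- function(x, ...) "masked"
masked_result <- mean(1:10)

# A namespace-qualified name still reaches the base R version.
base_result <- base::mean(1:10)

rm(mean)                           # drop the masking binding again
```

So the skipping behaviour only protects calls, and only against non-function objects. Code that passes the name around as a value, or a mask that is itself a function, gets whatever is found first.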

The examples shown by @Joris and @Sacha are where poor coding catches you out. One better way to write foo is:

foo <- function(x, fun) {
    fun <- match.fun(fun)
    fun(x)
}

When used, this gives:

> foo(1:10, mean)
[1] 5.5
> mean <- 1
> foo(1:10, mean)
[1] 5.5
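For contrast, here is a minimal sketch of what can happen without match.fun. The function foo_naive is a hypothetical stand-in written for this comparison; it is not the code posted by @Joris or @Sacha, which is not reproduced here.

```r
# Hypothetical naive version: calls whatever was passed in directly.
foo_naive <- function(x, fun) fun(x)

# Version from the answer: resolve the argument to a function first.
foo <- function(x, fun) {
    fun <- match.fun(fun)
    fun(x)
}

mean <- 1                          # mask base::mean with a number

# Inside foo_naive, `fun` evaluates to the number 1, so the call fails;
# tryCatch turns the error into a marker value for inspection.
naive_result <- tryCatch(foo_naive(1:10, mean), error = function(e) "error")

# match.fun recovers the symbol `mean` and looks it up as a function
# in the caller's environment, skipping the numeric object.
safe_result <- foo(1:10, mean)

rm(mean)                           # drop the masking binding again
```

Here naive_result ends up as the marker string while safe_result is the mean of 1:10, which is the difference the answer is pointing at.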

There are situations where masking will still catch you out, and @Joris's example with na.omit is one; IIRC, that happens because of the usual non-standard evaluation used in lm().

Several answers have also conflated the T vs TRUE issue with the issue of masking functions. As T and TRUE are not functions, that is a little outside the scope of @Andrie's question.
