r - Automated formula construction

asked Oct 17, 2021 in Technique by 深蓝 (71.8m points)

In a recent homework assignment, we were instructed to run 27 linear models, each time adding one more variable (the goal was to plot the changes in R² against the changes in adjusted R²). I found it difficult to create formulas like this algorithmically. The code I ended up using looked like the following (note that the first column in the data frame is the dependent variable, and all the rest are prospective independent variables):

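# Build the formula string from the first `howfar` columns of the data frame d.
make.formula <- function(howfar) {
  formula <- c()
  for (i in 1:howfar) {
    if (i == 1) {
      # the first column is the response
      formula <- paste(formula, names(d)[i], '~')
    } else if (i == howfar) {
      # the last term gets no trailing '+'
      formula <- paste(formula, names(d)[i], '')
    } else {
      formula <- paste(formula, names(d)[i], '+')
    }
  }
  return(formula)
}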

formulas <- lapply(seq(2, length(d)), make.formula)
formulas <- lapply(formulas, as.formula)
fits <- lapply(formulas, lm, data = d)

This works, but seems far from ideal, and my impression is that anything I'm doing with a for-loop in R is probably not being done the best way. Is there an easier way to algorithmically construct formulas for a given data frame?

1 Answer

深蓝 · Answer 1 · 2021-10-17T03:05:37+0000

reformulate(), a nifty function for creating formulas from character vectors, might come in handy. Here's an example of what it does:

reformulate(response="Y", termlabels=c("X1", "X2", "X3"))
# Y ~ X1 + X2 + X3

And here's how you might use it in practice. Note that I create the formulas inside the lm() calls: because formula objects carry information about the environment they were created in, I'd be a bit hesitant to create them outside the lm() call in which you actually want to use them.

evars <- names(mtcars)[2:5]
ii <- lapply(1:4, seq_len)

lapply(ii, 
       function(X) {
          coef(lm(reformulate(response="mpg", termlabels=evars[X]), data=mtcars))
})
# [[1]]
# (Intercept)         cyl 
#    37.88458    -2.87579 
# 
# [[2]]
# (Intercept)         cyl        disp 
# 34.66099474 -1.58727681 -0.02058363 
# 
# [[3]]
# (Intercept)         cyl        disp          hp 
# 34.18491917 -1.22741994 -0.01883809 -0.01467933 
# 
# [[4]]
# (Intercept)         cyl        disp          hp        drat 
# 23.98524441 -0.81402201 -0.01389625 -0.02317068  2.15404553
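To get from there to the R² vs. adjusted R² comparison in the question, something along these lines might work. This is only a sketch: it uses mtcars with mpg as the response as a stand-in for your data frame, and the names fits and r2 are my own choices, not anything from your assignment.

```r
# Predictors to add one at a time: every column of mtcars except the response.
evars <- setdiff(names(mtcars), "mpg")

# Fit the nested models mpg ~ cyl, mpg ~ cyl + disp, and so on.
# The formula is still built inside the function that calls lm().
fits <- lapply(seq_along(evars), function(k) {
  lm(reformulate(termlabels = evars[seq_len(k)], response = "mpg"),
     data = mtcars)
})

# Pull R-squared and adjusted R-squared out of each fit's summary.
r2 <- data.frame(
  n_terms       = seq_along(evars),
  r.squared     = sapply(fits, function(m) summary(m)$r.squared),
  adj.r.squared = sapply(fits, function(m) summary(m)$adj.r.squared)
)
```

Plotting the two columns of r2 against n_terms should then give the comparison you were after. For your own data you would swap in names(d)[-1] for evars and names(d)[1] for the response.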
