Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

R - How do I run a regression based on a correlation matrix rather than raw data?

I would like to run a regression based on a correlation matrix rather than raw data. I have looked at this post, but can't make sense of it. How do I do this in R?

Here is some code:

#Correlation matrix.
MyMatrix <- matrix(
            c(1.0, 0.1, 0.5, 0.4,
              0.1, 1.0, 0.9, 0.3,
              0.5, 0.9, 1.0, 0.3,
              0.4, 0.3, 0.3, 1.0), 
            nrow=4, 
            ncol=4)

df <- as.data.frame(MyMatrix)

colnames(df)[colnames(df)=="V1"] <- "a"
colnames(df)[colnames(df)=="V2"] <- "b"
colnames(df)[colnames(df)=="V3"] <- "c"
colnames(df)[colnames(df)=="V4"] <- "d"

#Assume means and standard deviations as follows:
MEAN.a <- 4.00
MEAN.b <- 3.90
MEAN.c <- 4.10
MEAN.d <- 5.00
SD.a <- 1.01
SD.b <- 0.95
SD.c <- 0.99
SD.d <- 2.20

#Run model [UNSURE ABOUT THIS PART]
library(lavaan)
m1 <- 'd ~ a + b + c'
fit <- sem(m1, ????)
summary(fit, standardize=TRUE)


1 Answer


This should do it. First, convert your correlation matrix to a covariance matrix by pre- and post-multiplying it by a diagonal matrix of the standard deviations:

MyMatrix <- matrix(
  c(1.0, 0.1, 0.5, 0.4,
    0.1, 1.0, 0.9, 0.3,
    0.5, 0.9, 1.0, 0.3,
    0.4, 0.3, 0.3, 1.0), 
  nrow=4, 
  ncol=4)
rownames(MyMatrix) <- colnames(MyMatrix) <- c("a", "b","c","d")

#Assume means and standard deviations as follows:
MEAN.a <- 4.00
MEAN.b <- 3.90
MEAN.c <- 4.10
MEAN.d <- 5.00
SD.a <- 1.01
SD.b <- 0.95
SD.c <- 0.99
SD.d <- 2.20
s <- c(SD.a, SD.b, SD.c, SD.d)
m <- c(MEAN.a, MEAN.b, MEAN.c, MEAN.d)
cov.mat <- diag(s) %*% MyMatrix %*% diag(s)
rownames(cov.mat) <- colnames(cov.mat) <- rownames(MyMatrix)
names(m) <- rownames(MyMatrix)
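
As a quick sanity check on the conversion above, the following base-R sketch rebuilds the covariance matrix in an element-wise form and confirms that cov2cor() recovers the original correlations. The names R, s, cov_a and cov_b are illustrative and not part of the original answer.

```r
# Correlation matrix and standard deviations from the question
R <- matrix(
  c(1.0, 0.1, 0.5, 0.4,
    0.1, 1.0, 0.9, 0.3,
    0.5, 0.9, 1.0, 0.3,
    0.4, 0.3, 0.3, 1.0),
  nrow = 4, ncol = 4,
  dimnames = list(c("a", "b", "c", "d"), c("a", "b", "c", "d")))
s <- c(a = 1.01, b = 0.95, c = 0.99, d = 2.20)

# Matrix form used in the answer: D %*% R %*% D, with D = diag(s)
cov_a <- diag(s) %*% R %*% diag(s)
dimnames(cov_a) <- dimnames(R)

# Equivalent element-wise form: cov[i, j] = s[i] * s[j] * R[i, j]
cov_b <- outer(s, s) * R

# Both forms agree, the diagonal holds the variances, and
# cov2cor() maps the result back to the correlation matrix
stopifnot(isTRUE(all.equal(cov_a, cov_b)))
stopifnot(isTRUE(all.equal(unname(diag(cov_b)), unname(s^2))))
stopifnot(isTRUE(all.equal(cov2cor(cov_b), R)))
```

lavaan's cor2cov(R, sds) helper should give the same matrix; the base-R version is shown here only to make the arithmetic visible.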

Then you can use lavaan to estimate the model, along the lines of the post you mentioned in your question. Note that when you fit from summary statistics, lavaan needs the number of observations (sample.nobs), and the standard errors depend on it. I used 100 for the example; replace it with the actual sample size behind your correlation matrix.

library(lavaan)
m1 <- 'd ~ a + b + c'
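fit <- sem(m1, 
           sample.cov = cov.mat, 
           sample.nobs = 100,      # number of observations; adjust to your data
           sample.mean = m,        # named vector of means
           meanstructure = TRUE)   # estimate the intercept as well
summary(fit, standardized = TRUE)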
# lavaan 0.6-6 ended normally after 44 iterations
# 
# Estimator                                         ML
# Optimization method                           NLMINB
# Number of free parameters                          5
# 
# Number of observations                           100
# 
# Model Test User Model:
#   
# Test statistic                                 0.000
# Degrees of freedom                                 0
# 
# Parameter Estimates:
#   
# Standard errors                             Standard
# Information                                 Expected
# Information saturated (h1) model          Structured
# 
# Regressions:
#                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
# d ~                                                                   
#   a                 6.317    0.095   66.531    0.000    6.317    2.900
#   b                12.737    0.201   63.509    0.000   12.737    5.500
#   c               -13.556    0.221  -61.307    0.000  -13.556   -6.100
# 
# Intercepts:
#                 Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
# .d               -14.363    0.282  -50.850    0.000  -14.363   -6.562
# 
# Variances:
#                 Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
# .d                 0.096    0.014    7.071    0.000    0.096    0.020
# 
# 
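
Because the model is a single linear regression with no latent variables, the point estimates above can be cross-checked from the summary statistics alone, without lavaan. The sketch below assumes the same correlations, means and standard deviations as in the question and solves the normal equations in moment form. The names S, Sxx, sxy, beta, alpha, beta_std and r2 are illustrative.

```r
# Summary statistics from the question
R <- matrix(
  c(1.0, 0.1, 0.5, 0.4,
    0.1, 1.0, 0.9, 0.3,
    0.5, 0.9, 1.0, 0.3,
    0.4, 0.3, 0.3, 1.0),
  nrow = 4, ncol = 4,
  dimnames = list(c("a", "b", "c", "d"), c("a", "b", "c", "d")))
s <- c(a = 1.01, b = 0.95, c = 0.99, d = 2.20)
m <- c(a = 4.00, b = 3.90, c = 4.10, d = 5.00)

S <- outer(s, s) * R          # covariance matrix
x <- c("a", "b", "c")         # predictors; d is the outcome
Sxx <- S[x, x]                # predictor covariances
sxy <- S[x, "d"]              # covariances of predictors with d

beta     <- solve(Sxx, sxy)                 # slopes: Sxx %*% beta = sxy
alpha    <- m[["d"]] - sum(beta * m[x])     # intercept from the means
beta_std <- beta * s[x] / s[["d"]]          # standardized slopes
r2       <- sum(beta * sxy) / S["d", "d"]   # proportion of variance explained

round(beta, 3)      # approx.  6.317  12.737 -13.556 (Estimate column)
round(alpha, 3)     # approx. -14.363 (intercept of d)
round(beta_std, 3)  # approx.  2.9  5.5 -6.1 (Std.all column)
round(1 - r2, 3)    # approx.  0.02 (Std.all of the residual variance)
```

The slopes and intercept do not depend on the number of observations; only the standard errors do. The residual variance lavaan reports (0.096) appears to be the ML estimate, which uses a divisor of N rather than N - 1, so it is slightly smaller than (1 - r2) * S["d", "d"].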

