This issue is raised over and over again, but unfortunately no satisfying answer exists that could serve as an appropriate duplicate target. It looks like I need to write one.
Most people know this is related to "contrasts", but not everyone knows why they are needed or how to interpret the result. We have to look at the model matrix in order to fully digest this.
Suppose we are interested in a model with two factors: ~ f + g (numerical covariates do not matter, so I include none of them; the response does not appear in the model matrix, so I drop it, too). Consider the following reproducible example:
set.seed(0)
f <- sample(gl(3, 4, labels = letters[1:3]))
# [1] c a a b b a c b c b a c
#Levels: a b c
g <- sample(gl(3, 4, labels = LETTERS[1:3]))
# [1] A B A B C B C A C C A B
#Levels: A B C
We start with a model matrix with no contrasts at all:
X0 <- model.matrix(~ f + g, contrasts.arg = list(
f = contr.treatment(n = 3, contrasts = FALSE),
g = contr.treatment(n = 3, contrasts = FALSE)))
# (Intercept) f1 f2 f3 g1 g2 g3
#1 1 0 0 1 1 0 0
#2 1 1 0 0 0 1 0
#3 1 1 0 0 1 0 0
#4 1 0 1 0 0 1 0
#5 1 0 1 0 0 0 1
#6 1 1 0 0 0 1 0
#7 1 0 0 1 0 0 1
#8 1 0 1 0 1 0 0
#9 1 0 0 1 0 0 1
#10 1 0 1 0 0 0 1
#11 1 1 0 0 1 0 0
#12 1 0 0 1 0 1 0
Note, we have:
unname( rowSums(X0[, c("f1", "f2", "f3")]) )
# [1] 1 1 1 1 1 1 1 1 1 1 1 1
unname( rowSums(X0[, c("g1", "g2", "g3")]) )
# [1] 1 1 1 1 1 1 1 1 1 1 1 1
So the (Intercept) column lies in both span{f1, f2, f3} and span{g1, g2, g3}. In this full specification, 2 columns are not identifiable. X0 will have column rank 1 + 3 + 3 - 2 = 5:
qr(X0)$rank
# [1] 5
So, if we fit a linear model with this X0, 2 of the 7 coefficients will be NA:
y <- rnorm(12) ## random `y` as a response
lm(y ~ X0 - 1) ## drop intercept as `X0` has intercept already
#X0(Intercept) X0f1 X0f2 X0f3 X0g1
# 0.32118 0.05039 -0.22184 NA -0.92868
# X0g2 X0g3
# -0.48809 NA
What this really implies is that we have to add 2 linear constraints on the 7 parameters in order to get a full rank model. It does not really matter what these 2 constraints are, but there must be 2 linearly independent constraints. For example, we can do any of the following:
- drop any 2 columns from X0;
- add two sum-to-zero constraints on the parameters, for example requiring the coefficients for f1, f2 and f3 to sum to 0, and the same for g1, g2 and g3;
- use regularization, for example adding a ridge penalty to f and g.
Note, these three ways end up with three different solutions:
- contrasts;
- constrained least squares;
- linear mixed models or penalized least squares.
The first two are still in the scope of fixed effect modelling. With "contrasts", we reduce the number of parameters until we get a full rank model matrix; the other two do not reduce the number of parameters, but they do reduce the effective degrees of freedom.
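To make the regularization route concrete, here is a minimal sketch of ridge-penalized least squares on the full, rank-deficient model matrix. The factors are typed in from the printout above so the sketch does not depend on the RNG; the response `y`, the penalty strength `lambda` and the penalty matrix `D` are assumptions made for illustration only, not part of the original example.

```r
# Factors copied from the printed example above (independent of R's RNG version)
f <- factor(c("c","a","a","b","b","a","c","b","c","b","a","c"))
g <- factor(c("A","B","A","B","C","B","C","A","C","C","A","B"))
set.seed(1)
y <- rnorm(12)                    # arbitrary response, illustration only

# Full model matrix without contrasts: 7 columns, rank 5
X0 <- model.matrix(~ f + g, contrasts.arg = list(
  f = contr.treatment(n = 3, contrasts = FALSE),
  g = contr.treatment(n = 3, contrasts = FALSE)))

lambda <- 1                       # assumed penalty strength, not tuned
D <- diag(c(0, rep(1, 6)))        # penalize the f and g columns, not the intercept

# Penalized normal equations: (X'X + lambda * D) beta = X'y
beta_ridge <- drop(solve(crossprod(X0) + lambda * D, crossprod(X0, y)))
beta_ridge                        # all 7 coefficients are estimated, none is NA
```

In this sketch the penalized coefficients within each factor should sum to zero up to rounding, because any common shift within a factor can be absorbed by the unpenalized intercept at no cost. The values themselves depend on `lambda`, which is why this counts as a different solution from the contrasts one.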
Now, you are certainly after the "contrasts" way. So, remember, we have to drop 2 columns. They can be
- one column from f and one column from g, giving a model ~ f + g, with f and g contrasted;
- the intercept, and one column from either f or g, giving a model ~ f + g - 1.
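The two options can be checked directly. The sketch below assumes R's default treatment contrasts for unordered factors (`options("contrasts")` unchanged) and uses an arbitrary response `y` that is not part of the original example; the factors are typed in from the printout above.

```r
# Factors copied from the printed example above
f <- factor(c("c","a","a","b","b","a","c","b","c","b","a","c"))
g <- factor(c("A","B","A","B","C","B","C","A","C","C","A","B"))

X1 <- model.matrix(~ f + g)       # drops one column from f and one from g
X2 <- model.matrix(~ f + g - 1)   # drops the intercept and one column from g

colnames(X1)  # "(Intercept)" "fb" "fc" "gB" "gC"  (default treatment contrasts)
colnames(X2)  # "fa" "fb" "fc" "gB" "gC"

# Both matrices have 5 columns and full column rank 5
c(ncol(X1), qr(X1)$rank)
c(ncol(X2), qr(X2)$rank)

# The two parametrizations span the same column space,
# so they give the same fitted values
set.seed(1)
y <- rnorm(12)                    # arbitrary response, illustration only
fit1 <- lm(y ~ f + g)
fit2 <- lm(y ~ f + g - 1)
all.equal(unname(fitted(fit1)), unname(fitted(fit2)))
```

Either way, exactly two of the seven columns of the full matrix are gone; only the interpretation of the remaining coefficients changes.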
It should now be clear that, within the framework of dropping columns, there is no way to get what you want, because you are expecting to drop only 1 column. The resulting model matrix would still be rank-deficient.
If you really want to have all coefficients there, use constrained least squares, or penalized regression / linear mixed models.
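As one possible sketch of the constrained route, the sum-to-zero constraints can be imposed by fitting with `contr.sum` and then reconstructing the coefficient of the last level of each factor as minus the sum of the others. The response `y` and the names `bf`, `bg` and `beta` are illustrative assumptions, and the factors are typed in from the printout above.

```r
# Factors copied from the printed example above
f <- factor(c("c","a","a","b","b","a","c","b","c","b","a","c"))
g <- factor(c("A","B","A","B","C","B","C","A","C","C","A","B"))
set.seed(1)
y <- rnorm(12)                    # arbitrary response, illustration only

# Fit with sum-to-zero contrasts: 5 free parameters
fit <- lm(y ~ f + g, contrasts = list(f = "contr.sum", g = "contr.sum"))
b <- coef(fit)                    # "(Intercept)" "f1" "f2" "g1" "g2"

# Recover the constrained coefficient of the last level of each factor
bf <- c(b[c("f1", "f2")], -sum(b[c("f1", "f2")]))
bg <- c(b[c("g1", "g2")], -sum(b[c("g1", "g2")]))

# All 7 coefficients, matched to the columns of the full matrix X0
X0 <- model.matrix(~ f + g, contrasts.arg = list(
  f = contr.treatment(n = 3, contrasts = FALSE),
  g = contr.treatment(n = 3, contrasts = FALSE)))
beta <- setNames(c(b["(Intercept)"], bf, bg), colnames(X0))
beta

# The full 7-parameter vector reproduces the fitted values
all.equal(unname(drop(X0 %*% beta)), unname(fitted(fit)))
```

All seven coefficients are now present, but they are only determined because of the two constraints; a different pair of constraints would give different numbers with the same fitted values.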
Now, when we have interactions between factors, things are more complicated, but the idea is still the same. Given that my answer is already long enough, I don't want to continue.