cid <int> | y <int> | x1 <dbl> | x2 <int> |
---|---|---|---|
1 | 10 | 0 | 2 |
1 | 5 | 0 | 0 |
1 | 5 | 0 | 1 |
1 | 7 | 0 | 2 |
2 | 5 | 1 | 2 |
2 | 5 | 1 | 2 |
2 | 5 | 1 | 2 |
2 | 8 | 1 | 1 |
3 | 8 | 1 | 0 |
3 | 5 | 1 | 0 |
Multicollinearity, Effect Size, Design Effects, and Power Analysis
University of Southern California
2024-03-22
y=Xγ+Zu+e
[105575558]=[102100101102112112112111][γ0γ1γ2]+[1210111211121212][u01u02u11u12]+[e11e12e13e14e21e22e23e24]
[105575558]=[102100101102112112112111][γ0γ1γ2]+[1210111211121212][u01u02u11u12]+[e11e12e13e14e21e22e23e24]
For cluster j, yj=Xjγ+Zjuj+ej
y=Xγ+Zu+e
In regression, we look at
If X1 and X2 are uncorrelated, xc1⊤xc2=0, and variance accounted for by the two predictors are separate
If X1 and X2 are correlated, xc1⊤xc2≠0, then adding X2 . . .
[105575558]=[102100101102112112112111][γ0γ1γ2]+[1210111211121212][u01u02u11u12]+[e11e12e13e14e21e22e23e24]
[105575558]=[1021210010101111021211211112121121211112][γ0γ1γ2u01u02u11u12]+[e11e12e13e14e21e22e23e24]
In MLM, we also look at
Z⊤Z, cross-products among columns of Z, and
Xc⊤Z, cross-products between columns of Xc and those of Z
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0.67 -0.33 -0.33 0.00 0.00 0.00
[2,] -0.33 0.67 -0.33 0.00 0.00 0.00
[3,] -0.33 -0.33 0.67 0.00 0.00 0.00
[4,] 0.00 0.00 0.00 0.67 -0.33 -0.33
[5,] 0.00 0.00 0.00 -0.33 0.67 -0.33
[6,] 0.00 0.00 0.00 -0.33 -0.33 0.67
[1] -1 -1 -1 1 1 1
Level-2 predictor can only account for level-2 variance
[1] -2 0 2 -3 2 1
Pure level-1 predictor can only account for level-1 variance
[1] -1 1 2 -3 0 1
Level-1 predictor can account for both level-2 and level-1 variance
[,1] [,2]
[1,] -1 0
[2,] 1 0
[3,] 2 0
[4,] 0 -3
[5,] 0 0
[6,] 0 1
Using simulations, we found that, when clustering is ignored, SEs of the following are underestimated
Because they all correlate with Z!
Additional Questions
Conditional on X
V(Zu+e)=σ2V=σ2[Z(D⊗IJ)Z⊤+I],
Unconditional (Xc = grand-mean centered X)
Xcγγ⊤Xc⊤⏟fixed+σ2Z(D⊗IJ)Z⊤⏟random lv-2+σ2I⏟random lv-1.
Tr[V(y)]/N=γ⊤Xc⊤Xcγ/N+σ2Tr[Z(D⊗IJ)Z⊤]/N+σ2
Tr[V(y)]/N=γ⊤ΣXγ+σ2[Tr(DKZ)+1],
Let Xr be the design matrix of variables with random effects1
Unconditional or Conditional on the fixed effects
ICCD=Tr(DKZ)Tr(DKZ)+1,
Proportion of variance by fixed effects (see Johnson, 2014; Rights & Sterba, 2019)1
R2GLMM=γ⊤ΣXγγ⊤ΣXγ+σ2[Tr(DKZ)+1].
Code Example
Linear mixed model fit by REML ['lmerMod']
Formula: mAch ~ ses + meanses + (ses | school)
Data: Hsb82
REML criterion at convergence: 46561.4
Scaled residuals:
Min 1Q Median 3Q Max
-3.1671 -0.7269 0.0163 0.7547 2.9646
Random effects:
Groups Name Variance Std.Dev. Corr
school (Intercept) 2.695 1.6418
ses 0.453 0.6731 -0.21
Residual 36.796 6.0659
Number of obs: 7185, groups: school, 160
Fixed effects:
Estimate Std. Error t value
(Intercept) 12.6740 0.1506 84.176
ses 2.1903 0.1218 17.976
meanses 3.7812 0.3826 9.882
[,1]
[1,] 8.190918
Code Example (cont’d)
library(r2mlm)
library(MuMIn)
# R^2 using formula, r2mlm::r2mlm(), and MuMIn::r.squaredGLMM()
list(formula = vf / (vf + vr + sigma(m1)^2),
r2mlm = r2mlm(m1),
MuMIn = MuMIn::r.squaredGLMM(m1))
$formula
[,1]
[1,] 0.170797
$r2mlm
$r2mlm$Decompositions
total
fixed 0.170816580
slope variation 0.005737776
mean variation 0.056202223
sigma2 0.767243421
$r2mlm$R2s
total
f 0.170816580
v 0.005737776
m 0.056202223
fv 0.176554356
fvm 0.232756579
$MuMIn
R2m R2c
[1,] 0.1708167 0.232756
Confidence interval of R2
lme4::bootMer()
or R package bootmlm)1Additional Questions
SMD = Mean difference / SDp1
For cluster-randomized trials
Yij=Tjγ+Z0ju0j+eij,V(u0j)=τ20
Tj = 0 (control) and 1 (treatment)
δ=γ√σ2[Tr(DKZ)+1]=γ√τ20+σ2
More general form:
δ=adjusted/unadjusted treatment effect√average conditional variance of y
δ=E[E(Y|T=1)−E(Y|T=0)|C]√Tr[V(Xγ|T)+V(Zu|T)]/N+σ2
where C is the subset of covariates for adjusted differences
f2GLMM=R2GLMM1−R2GLMM
Additional Questions
γs=γSDx√Tr[V(y)]/N
Linear mixed model fit by REML ['lmerMod']
Formula: mAch ~ ses + meanses + (ses | school)
Data: Hsb82
REML criterion at convergence: 46561.4
Scaled residuals:
Min 1Q Median 3Q Max
-3.1671 -0.7269 0.0163 0.7547 2.9646
Random effects:
Groups Name Variance Std.Dev. Corr
school (Intercept) 2.695 1.6418
ses 0.453 0.6731 -0.21
Residual 36.796 6.0659
Number of obs: 7185, groups: school, 160
Fixed effects:
Estimate Std. Error t value
(Intercept) 12.6740 0.1506 84.176
ses 2.1903 0.1218 17.976
meanses 3.7812 0.3826 9.882
Correlation of Fixed Effects:
(Intr) ses
ses -0.084
meanses -0.013 -0.256
For cluster means and cluster-mean centered variables, it is still natural to use the total SD
Design effect: Expected impact of design on sampling variance of an estimator1
For the sample mean (ˆμ):
Deff(ˆμ)=VSRS(ˆγ0)VMLM(ˆμSRS)=1+(n−1)ICC
Deff(ˆμ) Does Not Inform About Random Slopes
ˆγ=(X⊤V−1X)−1X⊤V−1y
VMLM(ˆγGLS)=σ2(X⊤V−1X)−1VSRS(ˆγOLS)=˜σ2(X⊤X)−1
where ˜σ2=σ2[Tr(DKZ)+1]
Unpacking V(ˆγGLS)
σ2(X⊤V−1X)−1=σ2{X⊤[Z(D⊗IJ)Z⊤+I]−1X}−1=σ2{X⊤[I−Z(D⊗I−1J+Z⊤Z)Z⊤]X}−1=σ2{X⊤X−X⊤Z(D⊗I−1J+Z⊤Z)Z⊤X}−1=σ2{[(X⊤X)−1+(X⊤X)−1X⊤Z(D⊗IJ)Z⊤X(X⊤X)−1]−1}−1=σ2[(X⊤X)−1+(X⊤X)−1X⊤Z(D⊗IJ)Z⊤X(X⊤X)−1]
We can substitute X with Xc. As Z is block diagonal, Xc⊤Z = [Xc⊤1Z1|Xc⊤2Z2|⋯|Xc⊤JZJ], and
NG=Xc⊤Z(D⊗IJ)Z⊤Xc=J∑j=1Xc⊤jZjDZ⊤jXcj
is the cross-product matrix of Xc⊤Zu, and Gkl seems equal to
{˜nNΣXkZDΣ⊤XlZwhen both Xl and Xk are purely level-1˜nN[ΣXkZDΣ⊤XlZ+KXkZDK⊤XlZ]otherwise
where ˜n=∑Jj=1n2j/N (reduced to n if cluster sizes are equal across clusters). So
σ2(Xc⊤V−1Xc)−=σ2[Σ−1X+˜nΣ−1XGΣ−1X],
where G = (ΣXZDΣ⊤XZ+FKXZDK⊤XZF), F is a diagonal filtering matrix in the form diag[f0,f1,…,fp−1], where fk = 1 if Xk has a level-2 component and 0 otherwise.
The OLS standard error and the MLM standard error have different rates of converagence to 0
For coefficient γk
Deff(ˆγk)={Σ−1X}kk+n[Σ−1XGΣ−1X]kk{Σ−1X}kk[Tr(DKZ)+1]=(1−ICCD)+(1−ICCD)n[Σ−1XGΣ−1X]kkΣ−1Xkk
Example: Growth curve modeling with Tx × slope interaction
yi=[1Ti001Ti1Ti1Ti22Ti1Ti33Ti1Ti44Ti][γ0γ1γ2γ3]+[1011121314][u0u1]+[ei1ei2ei3ei4ei5]
γ=[1,0,−0.1,−0.3]⊤,D=[0.50.20.1],σ2=1.5
icc_d r2_GLMM
0.655 0.056
Example cont’d
# Design effect
sigmaxz <- sigmax[, c(1, 3)]
Kxz <- Kx[, c(1, 3)]
F_mat <- diag(c(0, 1, 0, 1)) # filtering
G_mat <- (sigmaxz %*% D_mat %*% t(sigmaxz)) +
F_mat %*% Kxz %*% D_mat %*% t(Kxz) %*% F_mat
sigmax_inv <- MASS::ginv(sigmax)
(1 - icc) * (1 + num_obs * diag(sigmax_inv %*% G_mat %*% sigmax_inv) /
diag(sigmax_inv))
[1] NaN 0.6321839 0.6896552 0.6896552
For H0: γk=0
If you have any suggestions or feedback, please email me at hokchiol@usc.edu
Multicollinearity with Crossed Random Effects
y=Xγ+ZAτA+ZBτB+ϵ
3 x 10 sparse Matrix of class "dgCMatrix"
a 2 2 2 2 2 2 2 2 2 2
b 2 2 2 2 2 2 2 2 2 2
c 2 2 2 2 2 2 2 2 2 2
3 x 10 Matrix of class "dgeMatrix"
A B C D E F G H I J
a 0 0 0 0 0 0 0 0 0 0
b 0 0 0 0 0 0 0 0 0 0
c 0 0 0 0 0 0 0 0 0 0
(Unconditional) ICC and R2
From a random-intercepts model without fixed-effect predictors,
ICC = maximum R2 for a level-2 predictor
1 - ICC = maximum R2 for a purely level-1 predictor
(Conditional) ICC and R2
From a conditional model with fixed-effect predictors,
ICC = Proportion of variance accounted for by level-2 random effects
R2 = Proportion of variance accounted for by all fixed-effect predictors
With covariates X2, Yij=Tjγ1+x⊤2ijγ2+Z0ju0j+eij
δ=γ1√γ⊤2ΣX2γ2+τ20+σ2
With random slopes, Yij=Tjγ1+x⊤2ijγ2+z⊤ijuj+eij
δ=γ1√γ⊤2ΣX2γ2+σ2[Tr(DKZ)+1]
Code example (data)
Code example (cont’d)
library(lme4)
# Fit MLM
m1 <- lmer(y ~ treat + x2 + (x2 | clus_id), data = sim_data)
# Variance by x2
Sigma_x <- with(sim_data, { x2c <- x2 - mean(x2); mean(x2c^2) })
va_x2 <- Sigma_x * fixef(m1)[["x2"]]^2
# Variance by random intercepts and random slopes
# Using parameterization in lme4
Kz <- crossprod(cbind(1, sim_data$x2)) / nrow(sim_data)
va_z <- sum(VarCorr(m1)$clus_id * Kz) # Tr(Tau, Kz)
# Show all components
c(gamma_1 = fixef(m1)[["treat"]], va_x2 = va_x2,
va_z = va_z, va_e = sigma(m1)^2)
gamma_1 va_x2 va_z va_e
0.257714138 0.001712517 0.537828861 0.989420368
[1] 0.2084203