University of Southern California
March 27, 2024
\[\newcommand{\bv}[1]{\boldsymbol{\mathbf{#1}}}\]
2S-PA as an alternative to joint SEM modeling
Example 1: Categorical indicators violating measurement invariance
Example 2: Growth modeling of latent constructs
Extensions & Limitations
But constructs are typically not directly observed
But imperfect measurement leads to biased and spurious results
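As a minimal illustration of the bias, the sketch below simulates a construct measured with error and compares the regression slope from the true scores with the slope from the error-prone scores. All values (sample size, slope of 0.5, reliability of 0.5) are hypothetical and chosen only for illustration.

```r
set.seed(1)
n <- 1e5
eta <- rnorm(n)            # true construct, variance 1
y <- 0.5 * eta + rnorm(n)  # outcome; the true slope is 0.5
x <- eta + rnorm(n)        # observed score; reliability = 1 / (1 + 1) = 0.5

# Slope using the (normally unobserved) true construct
b_true <- unname(coef(lm(y ~ eta))["eta"])
# Slope using the error-prone observed score
b_obs <- unname(coef(lm(y ~ x))["x"])

round(c(true = b_true, observed = b_obs), 2)
```

Under these assumptions the observed slope should be close to the true slope multiplied by the reliability, that is, attenuated toward zero.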
\[ \begin{aligned} \text{Measurement: } & \tilde{\bv \eta}_i = \bv \Lambda^*_{\color{red}i} \bv \eta^*_i + \bv \varepsilon^*_i \\ & \bv \varepsilon^*_i \sim N(\bv 0, \bv \Theta^*_{\color{red}i}) \\ \text{Structural: } & \bv \eta^*_{i} = \bv \alpha^* + \bv B^* \bv \eta^*_{i} + \bv \zeta^*_{i} \end{aligned} \]
Multiple-group latent regression
Challenges with JM
F1 SE_F1
[1,] -0.9050792 0.6705626
[2,] 0.1212936 0.4302879
[3,] -0.9050792 0.6705626
[4,] 0.2327338 0.4089029
[5,] 0.2327338 0.4089029
[6,] 1.4184295 0.3354633
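We set \(V(\eta)\) = 1. As inputs for 2S-PA, we need to obtain \(\lambda^*_i\) and \(\tilde \theta^*_i\) for each person from the standard error of the factor score as
\[ \lambda^*_i = 1 - \mathit{SE}_i^2, \qquad \tilde \theta^*_i = \mathit{SE}_i^2 (1 - \mathit{SE}_i^2) \]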
F1 SE_F1 loading_i errorvar_i
1 -0.905 0.671 0.550 0.247
2 0.121 0.430 0.815 0.151
3 -0.905 0.671 0.550 0.247
4 0.233 0.409 0.833 0.139
5 0.233 0.409 0.833 0.139
6 1.418 0.335 0.887 0.100
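The `loading_i` and `errorvar_i` columns can be reproduced from `SE_F1` alone. The sketch below assumes regression (EAP) factor scores with \(V(\eta)\) = 1, so that the loading is 1 minus the squared standard error and the error variance is the squared standard error times the loading. The data frame name `fs` is only for illustration.

```r
# Factor scores and standard errors copied from the output above
fs <- data.frame(
  F1 = c(-0.9050792, 0.1212936, -0.9050792,
         0.2327338, 0.2327338, 1.4184295),
  SE_F1 = c(0.6705626, 0.4302879, 0.6705626,
            0.4089029, 0.4089029, 0.3354633)
)

# Person-specific loading (equal to the reliability when V(eta) = 1)
fs$loading_i <- 1 - fs$SE_F1^2
# Person-specific error variance of the factor score
fs$errorvar_i <- fs$SE_F1^2 * (1 - fs$SE_F1^2)

round(fs, 3)
```

Rounded to three decimals, the result matches the table above; observations with larger standard errors receive smaller loadings.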
Generalizing to multidimensional measurement models
Software usually gives \(\text{ACOV}(\tilde {\bv \eta}_i)\) as output
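One way to see how \(\bv \Lambda^*\) and \(\bv \Theta^*\) can be recovered from the ACOV is a small numerical check. The sketch below assumes regression scores, takes the ACOV to be the conditional (posterior) covariance of the latent variables, and uses a hypothetical two-factor model with made-up parameter values. Under those assumptions, \(\bv \Lambda^* = \bv I - \text{ACOV}\, \bv \Psi^{-1}\) and \(\bv \Theta^* = \bv \Lambda^* \text{ACOV}\), where \(\bv \Psi\) is the latent covariance matrix.

```r
# Hypothetical two-factor model (all values are illustrative)
Lambda <- matrix(c(.7, .6, .8,  0,  0,  0,
                    0,  0,  0, .5, .9, .6), ncol = 2)
Psi <- matrix(c(1, .4, .4, 1), nrow = 2)        # latent covariance
Theta <- diag(c(.51, .64, .36, .75, .19, .64))  # unique variances
Sigma <- Lambda %*% Psi %*% t(Lambda) + Theta   # implied covariance

# Regression-score weights: eta_tilde = A (y - mean)
A <- Psi %*% t(Lambda) %*% solve(Sigma)

# Direct definitions of the factor-score loading and error covariance
ld_direct <- A %*% Lambda
ev_direct <- A %*% Theta %*% t(A)

# The same quantities recovered from the conditional covariance
acov <- Psi - A %*% Lambda %*% Psi
ld_acov <- diag(2) - acov %*% solve(Psi)
ev_acov <- ld_acov %*% acov

all.equal(ld_direct, ld_acov)
all.equal(ev_direct, ev_acov)
```

With \(\bv \Psi = \bv I\) and a single factor, these expressions reduce to the unidimensional case, where the loading is 1 minus the squared standard error.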
Implementation in R package R2spa
library(umx)    # umxLav2RAM()
library(R2spa)  # tspa_mx_model()
# Prepare data: person-specific reliability and error variance
fs_dat <- fs_dat |>
  within(expr = {
    rel_class <- 1 - class_se^2
    rel_audit <- 1 - audit_se^2
    ev_class <- class_se^2 * (1 - class_se^2)
    ev_audit <- audit_se^2 * (1 - audit_se^2)
  })
# Define model
latreg_umx <- umxLav2RAM(
  "
    fs_audit ~ fs_class
    fs_audit + fs_class ~ 1
  ",
  printTab = FALSE
)
# lambda (reliability); entries name columns of fs_dat
cross_load <- matrix(c("rel_audit", NA, NA, "rel_class"), nrow = 2) |>
  `dimnames<-`(rep(list(c("fs_audit", "fs_class")), 2))
# Error of factor scores; entries name columns of fs_dat
err_cov <- matrix(c("ev_audit", NA, NA, "ev_class"), nrow = 2) |>
  `dimnames<-`(rep(list(c("fs_audit", "fs_class")), 2))
# Create model in Mx
tspa_mx <- tspa_mx_model(latreg_umx,
  data = fs_dat,
  mat_ld = cross_load, mat_vc = err_cov
)
Method | Est | SE | CI
---|---|---|---
Joint Modeling | 0.614 | 0.030 | [0.556, 0.672]
Factor score regression | 0.543 | 0.024 | [0.495, 0.590]
2S-PA | 0.669 | 0.027 | [0.617, 0.722]
2S-PA is Flexible
But Choices Need To Be Made ...
Joint: Multidimensional model
Separate: Several unidimensional models
cf. Lai et al. (2023)
Quantity | Composite scores | Regression scores | Bartlett scores
---|---|---|---
Observed variance | \(\bv 1^\top \bv \Sigma_X \bv 1\) | \(\psi^2 \bv \lambda^\top \bv \Sigma_X^{-1} \bv \lambda\) | \(\psi + (\bv \lambda^\top \bv \Theta^{-1} \bv \lambda)^{-1}\)
\(\lambda^*\) | \(\sum_j \lambda_j\) | \(\psi \bv \lambda^\top \bv \Sigma_X^{-1} \bv \lambda\) | 1
Reliability | \(\dfrac{(\sum_j \lambda_j)^2 \psi}{\bv 1^\top \bv \Sigma_X \bv 1}\) | \(\psi \bv \lambda^\top \bv \Sigma_X^{-1} \bv \lambda\) | \(\dfrac{\psi}{\psi + (\bv \lambda^\top \bv \Theta^{-1} \bv \lambda)^{-1}}\)
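The table entries can be evaluated directly for a one-factor model. The sketch below uses hypothetical loadings and unique variances (not taken from any data set in this talk) and computes the observed variance, \(\lambda^*\), and reliability for each type of score.

```r
# Hypothetical one-factor model (illustrative values only)
lambda <- c(.7, .6, .8, .5)           # item loadings
psi <- 1                              # latent variance
Theta <- diag(c(.51, .64, .36, .75))  # unique variances
Sigma_X <- psi * tcrossprod(lambda) + Theta
ones <- rep(1, length(lambda))

# Composite (sum) scores
var_c <- drop(t(ones) %*% Sigma_X %*% ones)
ld_c <- sum(lambda)
rel_c <- ld_c^2 * psi / var_c

# Regression scores
q <- drop(t(lambda) %*% solve(Sigma_X) %*% lambda)
var_r <- psi^2 * q
ld_r <- psi * q
rel_r <- psi * q

# Bartlett scores
ev_b <- 1 / drop(t(lambda) %*% solve(Theta) %*% lambda)
var_b <- psi + ev_b
ld_b <- 1
rel_b <- psi / var_b

round(rbind(
  composite = c(variance = var_c, loading = ld_c, reliability = rel_c),
  regression = c(var_r, ld_r, rel_r),
  bartlett = c(var_b, ld_b, rel_b)
), 3)
```

In a one-factor model, regression and Bartlett scores are rescalings of each other, so their reliabilities coincide even though their loadings and error variances differ.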
ECLS-K: Achievement (Science, Reading, Math) across Grades 3, 5, and 8
A challenge of joint modeling is that the definition of latent variables can change across models
Indicator | Latent Basis | No Growth | Measurement Only
---|---|---|---
Science | 14.87 | 18.57 | 14.83
Reading | 21.47 | 28.19 | 21.39
Math | 20.20 | 25.93 | 20.11
↑ Note the loadings change across different models
With cross-loadings and/or correlated errors, scoring should be done with a joint multidimensional factor model
Mean structure
\[ \tilde{\bv \eta}_i = \bv {\color{red}b^*}_{\color{red}i} + \bv \Lambda^*_i \bv \eta^*_i + \bv \varepsilon^*_i \]
library(R2spa)  # get_fs(), tspa()
# Get factor scores from partial scalar invariance model
fs_dat <- R2spa::get_fs(eclsk, model = pscalar_mod)
# Growth model
tspa_growth_mod <- "
  i =~ 1 * eta1 + 1 * eta2 + 1 * eta3
  s =~ 0 * eta1 + start(.5) * eta2 + 1 * eta3
  # factor error variances (assume homogeneity)
  eta1 ~~ psi * eta1
  eta2 ~~ psi * eta2
  eta3 ~~ psi * eta3
  i ~~ start(.8) * i
  s ~~ start(.5) * s
  i ~~ start(0) * s
  i + s ~ 1
"
# Fit the growth model
tspa_growth_fit <- tspa(tspa_growth_mod, fs_dat,
                        fsT = attr(fs_dat, "fsT"),
                        fsL = attr(fs_dat, "fsL"),
                        fsb = attr(fs_dat, "fsb"),
                        estimator = "ML")
summary(tspa_growth_fit)
Parameter | Model | Est | SE | LRT \(\chi^2\)
---|---|---|---|---
Mean slope | JSEM | 1.873 | 0.025 | 2223.513
Mean slope | 2S-PA (Reg) | 1.874 | 0.018 | 2271.428
Mean slope | 2S-PA (Bart) | 1.874 | 0.018 | 2271.428
Mean slope | FS (Reg) | 1.874 | 0.010 | 3282.137
Mean slope | FS (Bart) | 1.874 | 0.019 | 2248.001
Var slope | JSEM | 0.099 | 0.017 |
Var slope | 2S-PA (Reg) | 0.100 | 0.016 |
Var slope | 2S-PA (Bart) | 0.100 | 0.016 |
Var slope | FS (Reg) | 0.065 | 0.004 |
Var slope | FS (Bart) | 0.141 | 0.016 |
2S-PA treats \(\bv \Lambda^*\) and \(\bv \Theta^*\) as known
Solution 1: Bayesian estimation of factor scores (Lai and Hsiao 2022)
Solution 2: Incorporating SE of \(\bv \Lambda^*\) and \(\bv \Theta^*\) (Meijer, Oczkowski, and Wansbeek 2021)
Tedious to do product indicators
With 2S-PA, just one product factor score indicator
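To make the single product indicator concrete, the sketch below simulates two factor scores, forms their product, and compares the simulated error variance of the product with a closed-form value. It assumes zero-mean latent variables with unit variance and independent, zero-mean normal errors; the loadings and error variances are hypothetical. Under those assumptions the product loads on \(\eta_x \eta_z\) with loading \(\lambda^*_x \lambda^*_z\) and has error variance \(\lambda^{*2}_x V(\eta_x) \theta^*_z + \lambda^{*2}_z V(\eta_z) \theta^*_x + \theta^*_x \theta^*_z\).

```r
set.seed(2024)
n <- 2e5
ld_x <- 0.8; ld_z <- 0.7    # hypothetical lambda* values
ev_x <- 0.16; ev_z <- 0.21  # hypothetical theta* values

# Correlated latent variables, each with mean 0 and variance 1
eta_x <- rnorm(n)
eta_z <- 0.3 * eta_x + sqrt(1 - 0.3^2) * rnorm(n)

# Factor scores as error-prone indicators of the latent variables
fs_x <- ld_x * eta_x + rnorm(n, sd = sqrt(ev_x))
fs_z <- ld_z * eta_z + rnorm(n, sd = sqrt(ev_z))

# Single product indicator for the latent interaction
fs_xz <- fs_x * fs_z
ld_xz <- ld_x * ld_z
ev_xz <- ld_x^2 * 1 * ev_z + ld_z^2 * 1 * ev_x + ev_x * ev_z

# Compare the closed-form error variance with the simulated one
err_xz <- fs_xz - ld_xz * eta_x * eta_z
c(formula = ev_xz, simulated = var(err_xz))
```

The loading and error variance of the product indicator are then fixed in the structural model in the same way as for the first-order factor scores.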
With measurement error
Estimates are virtually identical to those with joint modeling
Undergraduate and Graduate students
Collaborators