library(lavaan)
library(MplusAutomation)
library(posterior)
library(cmdstanr)
register_knitr_engine()
Bayesian Multilevel Factor Model (Part II)
Comparing different approaches for fitting multilevel factor models with random intercepts and latent variances
In Part I, I explore how to fit a traditional multilevel factor model with STAN. That model, which is the one used in well over 90% of the literature, makes a very strong assumption that I believe is rarely discussed: it assumes homogeneous variance of the latent variable across clusters. In Part II, I explore how that assumption can be relaxed, and what the impact is when it is ignored.
Simulate Data
Same setup as before: one factor, four items, two of which have random intercepts.
This time, however, the within-cluster SD of the latent variable varies across clusters, taking evenly spaced values over the range [0.2, 1.8]. We also arrange the data so that clusters with a higher mean of \(\eta\) tend to have a lower SD of \(\eta\).
# Design parameters
set.seed(123)
num_clus <- 500  # number of clusters
clus_size <- 10  # cluster size
icc_eta <- .10   # target ICC of the latent variable
lambda <- c(.7, .9, .7, .8)    # factor loadings
nu <- c(0, 0.3, -0.5, 0.2)     # item intercepts
range_dpsiw <- .80             # half-range of the within-cluster latent SD (0.2 to 1.8)
theta_nu <- c(0, 0, 0.1, 0.05) # intercept variances
theta <- c(.51, .51, .40, .40) # uniqueness
# assume one-factor model holds for now
# Simulate latent variable
clus_id <- rep(1:num_clus, each = clus_size)
etab <- rnorm(num_clus, sd = sqrt(icc_eta / (1 - icc_eta)))
psiw <- 1 + seq(-range_dpsiw, range_dpsiw, length.out = num_clus)
# Arrange etab and psiw so that they're negatively correlated
etab <- sort(etab)[c(
  sample((num_clus / 2 + 1):num_clus),
  sample(seq_len(num_clus / 2))
)]
etaw <- rnorm(num_clus * clus_size, sd = psiw[clus_id])
eta <- etab[clus_id] + etaw
# confirm the simulated variable behaves as expected
# lmer(eta ~ (1 | clus_id))
# Simulate items for each cluster
num_items <- 4
y <- lapply(1:num_clus, FUN = \(j) {
  yj <- nu + rnorm(nu, sd = sqrt(theta_nu)) +
    tcrossprod(
      lambda, # + rnorm(lambda, sd = sqrt(theta_lambda)),
      etab[j] + etaw[clus_id == j]
    ) +
    rnorm(num_items * clus_size, sd = sqrt(theta))
  t(yj)
})
dat <- do.call(rbind, y) |>
  as.data.frame() |>
  setNames(paste0("y", 1:4)) |>
  cbind(id = clus_id)
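Before fitting any models, a couple of quick sanity checks (my addition, not in the original code) can confirm that the simulated data behave as intended: cluster means of \(\eta\) should be negatively related to cluster SDs of \(\eta\), and the cluster SDs should span roughly [0.2, 1.8].
# Sanity checks on the simulated latent variable (not in the original post)
cor(tapply(eta, clus_id, mean), tapply(eta, clus_id, sd)) # should be negative
range(tapply(eta, clus_id, sd))                           # roughly [0.2, 1.8]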
Frequentist with lavaan
(assuming homogeneous variance)
mcfa_mod <- "
level: 1
  f =~ l1 * y1 + l2 * y2 + l3 * y3 + l4 * y4
  f ~~ 1 * f
level: 2
  f =~ l1 * y1 + l2 * y2 + l3 * y3 + l4 * y4
"
mcfa_fit <- cfa(mcfa_mod, data = dat, cluster = "id", auto.fix.first = FALSE)
Warning: lavaan->lav_object_post_check():
some estimated ov variances are negative
summary(mcfa_fit)
lavaan 0.6-19 ended normally after 31 iterations
Estimator ML
Optimization method NLMINB
Number of model parameters 21
Number of equality constraints 4
Number of observations 5000
Number of clusters [id] 500
Model Test User Model:
Test statistic 0.737
Degrees of freedom 7
P-value (Chi-square) 0.998
Parameter Estimates:
Standard errors Standard
Information Observed
Observed information based on Hessian
Level 1 [within]:
Latent Variables:
Estimate Std.Err z-value P(>|z|)
f =~
y1 (l1) 0.760 0.014 56.267 0.000
y2 (l2) 0.982 0.015 64.482 0.000
y3 (l3) 0.774 0.013 59.934 0.000
y4 (l4) 0.867 0.014 62.671 0.000
Variances:
Estimate Std.Err z-value P(>|z|)
f 1.000
.y1 0.512 0.013 39.022 0.000
.y2 0.491 0.015 32.610 0.000
.y3 0.379 0.011 35.328 0.000
.y4 0.419 0.012 33.777 0.000
Level 2 [id]:
Latent Variables:
Estimate Std.Err z-value P(>|z|)
f =~
y1 (l1) 0.760 0.014 56.267 0.000
y2 (l2) 0.982 0.015 64.482 0.000
y3 (l3) 0.774 0.013 59.934 0.000
y4 (l4) 0.867 0.014 62.671 0.000
Intercepts:
Estimate Std.Err z-value P(>|z|)
.y1 0.004 0.018 0.214 0.830
.y2 0.294 0.022 13.531 0.000
.y3 -0.471 0.023 -20.428 0.000
.y4 0.186 0.021 8.660 0.000
Variances:
Estimate Std.Err z-value P(>|z|)
.y1 -0.007 0.004 -1.677 0.094
.y2 -0.003 0.005 -0.649 0.516
.y3 0.110 0.010 10.755 0.000
.y4 0.040 0.006 6.163 0.000
f 0.097 0.015 6.597 0.000
From these results, the factor loadings appear inflated (.76 to .98, versus the true values of .7 to .9) when the random latent variance is ignored.
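As a quick check (my addition), the snippet below lines up the estimated loadings against the data-generating values; it assumes the loading rows come first in lavaan's parameter table (with the equality constraints, the within- and between-level loadings are identical anyway).
# Compare estimated loadings with the true values (lambda = .7, .9, .7, .8)
pe <- parameterEstimates(mcfa_fit)
cbind(true = lambda, estimated = pe[pe$op == "=~", "est"][1:4])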
Bayesian using Mplus
We can fit a Bayesian model with random latent variance using Mplus, which uses Gibbs sampling and conjugate priors. Specifically, for person \(i\) in cluster \(j\), the latent variable follows
\[ \eta_{ij} \sim N(\eta^b_j, \psi^w_j), \]
and we model \((\eta^b_j, \log[\psi^w_j])\) as jointly normal
\[ \begin{pmatrix} \eta^b_j \\ \log[\psi^w_j] \end{pmatrix} \sim N\left( \begin{bmatrix} \alpha_1 \\ \alpha_2 \end{bmatrix}, \begin{bmatrix} \psi^b_1 & \\ \psi^b_{12} & \psi^b_2 \end{bmatrix} \right) \tag{1}\]
Once again, we need one constraint on the mean structure and one on the covariance structure. Here, we can conveniently set \(\alpha_1 = \alpha_2 = 0\) so that \(E(\eta) = 0\) and \(E[\log(\psi^w_j)] = 0\) (a log variance of 0 corresponds to a variance of 1).
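As a small illustration of Equation 1 (my own sketch, with hypothetical values for \(\psi^b_1\), \(\psi^b_{12}\), and \(\psi^b_2\)), drawing the pair \((\eta^b_j, \log[\psi^w_j])\) from a bivariate normal with zero means gives clusters whose within-cluster latent variance is \(\exp(0) = 1\) for the typical (median) cluster:
# Hypothetical values for the level-2 covariance matrix in Equation 1
psib_mat <- matrix(c(0.15, -0.10,
                     -0.10, 0.30), nrow = 2)
xi <- MASS::mvrnorm(5, mu = c(0, 0), Sigma = psib_mat) # (eta_b, log psi_w) pairs
cbind(eta_b = xi[, 1], psi_w = exp(xi[, 2]))           # cluster latent means and variances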
# Save data for Mplus
write.table(dat, file = "mcfa.dat", row.names = FALSE, col.names = FALSE)
writeLines("
TITLE: Bayesian 1-factor model with random variance;
DATA: FILE = mcfa.dat;
VARIABLE:
NAMES = y1-y4 id;
USEVAR = y1-y4;
CLUSTER = id;
ANALYSIS:
TYPE = TWOLEVEL RANDOM;
ESTIMATOR = BAYES; ! This is default
BITERATION = 500000 (10000);
BCONVERGENCE = .005; ! ESS of ~ 100
CHAINS = 3;
PROCESS = 3;
MODEL: %WITHIN%
fw BY y1-y4* (l1-l4);
logpsiw | fw;
%BETWEEN%
fb BY y1-y4* (l1-l4);
logpsiw with fb; ! psi_12
fb; logpsiw; ! psi_1 and psi_2
[logpsiw@0]; ! for setting the scale
y1-y4;
OUTPUT: TECH1 TECH8;
SAVEDATA:
BPARAMETERS = mcfa_random_var_draws.dat;
", con = "mcfa_random_var.inp")
mplus_draws <- readModels("mcfa_random_var.out")$bparameters[[2]] # discard burn-in
# Drop chain and iteration columns
mplus_draws[[1]] <- mplus_draws[[1]][, -c(1, 2)]
mplus_draws[[2]] <- mplus_draws[[2]][, -c(1, 2)]
mplus_draws[[3]] <- mplus_draws[[3]][, -c(1, 2)]
mplus_draws <- posterior::as_draws_list(mplus_draws)
mplus_draws |>
  posterior::summarise_draws() |>
  knitr::kable(digits = 2)
variable | mean | median | sd | mad | q5 | q95 | rhat | ess_bulk | ess_tail |
---|---|---|---|---|---|---|---|---|---|
Parameter.1_%WITHIN%:.FW.BY.Y1.(equality/label) | 0.61 | 0.61 | 0.02 | 0.02 | 0.58 | 0.64 | 1.04 | 158.87 | 395.22 |
Parameter.2_%WITHIN%:.FW.BY.Y2.(equality/label) | 0.78 | 0.78 | 0.02 | 0.02 | 0.74 | 0.82 | 1.05 | 148.31 | 243.70 |
Parameter.3_%WITHIN%:.FW.BY.Y3.(equality/label) | 0.61 | 0.61 | 0.02 | 0.02 | 0.58 | 0.64 | 1.04 | 135.09 | 306.24 |
Parameter.4_%WITHIN%:.FW.BY.Y4.(equality/label) | 0.69 | 0.69 | 0.02 | 0.02 | 0.66 | 0.73 | 1.04 | 162.75 | 316.70 |
Parameter.5_%WITHIN%:.Y1 | 0.50 | 0.50 | 0.01 | 0.01 | 0.48 | 0.52 | 1.00 | 9178.99 | 20794.23 |
Parameter.6_%WITHIN%:.Y2 | 0.49 | 0.49 | 0.01 | 0.01 | 0.47 | 0.52 | 1.00 | 5870.86 | 13847.52 |
Parameter.7_%WITHIN%:.Y3 | 0.38 | 0.38 | 0.01 | 0.01 | 0.36 | 0.40 | 1.00 | 9548.73 | 18329.28 |
Parameter.8_%WITHIN%:.Y4 | 0.41 | 0.41 | 0.01 | 0.01 | 0.39 | 0.43 | 1.00 | 7283.10 | 14730.57 |
Parameter.9_%BETWEEN%:.MEAN.Y1 | 0.01 | 0.01 | 0.02 | 0.02 | -0.02 | 0.04 | 1.02 | 76.65 | 99.72 |
Parameter.10_%BETWEEN%:.MEAN.Y2 | 0.30 | 0.30 | 0.02 | 0.02 | 0.27 | 0.33 | 1.03 | 75.07 | 174.96 |
Parameter.11_%BETWEEN%:.MEAN.Y3 | -0.46 | -0.46 | 0.02 | 0.02 | -0.50 | -0.43 | 1.00 | 199.26 | 602.42 |
Parameter.12_%BETWEEN%:.MEAN.Y4 | 0.19 | 0.19 | 0.02 | 0.02 | 0.16 | 0.23 | 1.01 | 123.90 | 323.58 |
Parameter.13_%BETWEEN%:.Y1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 1.01 | 90.05 | 152.61 |
Parameter.14_%BETWEEN%:.Y2 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 1.03 | 47.44 | 69.72 |
Parameter.15_%BETWEEN%:.Y3 | 0.11 | 0.11 | 0.01 | 0.01 | 0.09 | 0.13 | 1.00 | 8387.99 | 14736.77 |
Parameter.16_%BETWEEN%:.Y4 | 0.04 | 0.04 | 0.01 | 0.01 | 0.03 | 0.05 | 1.00 | 2281.47 | 3623.19 |
Parameter.17_%BETWEEN%:.FB | 0.15 | 0.15 | 0.02 | 0.02 | 0.11 | 0.19 | 1.02 | 448.34 | 1637.46 |
Parameter.18_%BETWEEN%:.LOGPSIW.WITH.FB | -0.24 | -0.24 | 0.03 | 0.03 | -0.30 | -0.18 | 1.00 | 764.27 | 3103.02 |
Parameter.19_%BETWEEN%:.LOGPSIW | 1.10 | 1.09 | 0.11 | 0.11 | 0.93 | 1.28 | 1.01 | 825.07 | 2752.46 |
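One detail not shown above is the call that actually runs Mplus and produces mcfa_random_var.out. A sketch of how this could be done from R with MplusAutomation::runModels() (my assumption about how the model was run), which would also record the wall time stored in mplus_sec for the comparison at the end:
# Run the Mplus input file and record the elapsed time (one possible approach)
mplus_sec <- system.time(
  runModels("mcfa_random_var.inp")
)[["elapsed"]]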
Using STAN
Trial 1
In the first trial, I sample both \(\eta^b_j\) and \(\log(\psi^w_j)\), and use the conditional distributions:
\[ \begin{aligned} \bar{\boldsymbol{\mathbf{y}}}_j \mid \eta^b_j, \log(\psi^w_j) & \sim N(\boldsymbol{\mathbf{\nu}} + \boldsymbol{\mathbf{\Lambda}} \eta^b_j, \boldsymbol{\mathbf{\Sigma}}_{Wj} / n_j + \boldsymbol{\mathbf{\Theta}}^b) \\ \boldsymbol{\mathbf{y}}_{ij} - \bar{\boldsymbol{\mathbf{y}}}_j \mid \eta^b_j, \log(\psi^w_j) & \sim N(\boldsymbol{\mathbf{0}}, \boldsymbol{\mathbf{\Sigma}}_{Wj}) \quad \text{for } i = 1, \ldots, n_j - 1 \end{aligned} \]
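The Stan program below evaluates the within-cluster part of this likelihood from sufficient statistics. The key identity, checked numerically here with base R (my addition), is that the summed quadratic forms in a multivariate normal log density reduce to a trace involving only the cross-product matrix \(S = \sum_i y_i y_i^\top\) and \(n\):
# Check: sum_i y_i' Sigma^{-1} y_i equals trace(Sigma^{-1} S), with S = sum_i y_i y_i'
set.seed(1)
Sigma <- diag(2) + 0.3
Y <- MASS::mvrnorm(20, mu = c(0, 0), Sigma = diag(2))
S <- crossprod(Y)
all.equal(sum(mahalanobis(Y, center = c(0, 0), cov = Sigma)),
          sum(diag(solve(Sigma, S))))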
//
// This Stan program defines a one-factor two-level CFA model with a
// marginal likelihood approach similar to one described in
// https://ecmerkle.github.io/blavaan/articles/multilevel.html.
//
// Define function for computing multi-normal log density using sufficient
// statistics
functions {
real multi_normal_suff_lpdf(matrix S,
matrix Sigma,
int n) {
return -0.5 * (n * (log_determinant_spd(Sigma) + cols(S) * log(2 * pi())) +
trace(mdivide_left_spd(Sigma, S)));
}
}
// The input data are (a) sw, the cluster-specific within-cluster cross-product
// matrices, and (b) ybar, the cluster means.
data {
int<lower=1> p; // number of items
int<lower=1> J; // number of clusters
array[J] int<lower=1> n; // cluster size
array[J] matrix[p, p] sw; // within cross-product matrices
array[J] vector[p] ybar; // cluster means
}
// The parameters accepted by the model:
// 'lambda_star' for unscaled loadings, 'nu' for intercepts,
// 'thetaw' and 'thetab' for uniqueness at the within and the between levels,
// and 'sd_phi', 'L_phi', and 'z_xi' for the cluster-specific location-scale
// random effects of the latent variable
parameters {
vector[p] lambda_star; // unscaled loading vector assuming cross-level
// invariance
vector<lower=0>[p] thetaw; // within-cluster uniqueness vector
vector<lower=0>[p] thetab; // between-cluster uniqueness vector
cholesky_factor_corr[2] L_phi; // cholesky of correlation matrix of (logvar, mean)
vector<lower=0>[2] sd_phi; // standard deviations of (logvar, mean)
matrix[2, J] z_xi; // scaled values of (logvar, mean)
// real<lower=0> sd_logpsiw; // SD of log(SD) of etaw
// real<lower=0> psib; // between-cluster latent standard deviation
// vector[J] z_logpsiw; // scaled value of log(SD) of etaw
vector[p] nu; // item means/intercepts
}
// The model to be estimated. We compute the log-likelihood as the sum of
// two components: cluster-mean centered variables are multivariate normal
// with covariance matrix sigmaw (and degree of freedom adjustment),
// and cluster-mean variables are multivariate normal with covariance
// matrix sigmab + sigmaw / nj
model {
  matrix[p, p] sigmawj;
  matrix[2, J] xi = diag_pre_multiply(sd_phi, L_phi) * z_xi;
  matrix[p, J] mu = lambda_star * xi[2, ];
  matrix[p, p] LLt = lambda_star * lambda_star';
  // matrix[p, p] sigmab = add_diag(square(psib) .* LLt, square(thetab));
  lambda_star ~ std_normal();
  thetaw ~ student_t(4, 0, 1);
  thetab ~ std_normal();
  to_vector(z_xi) ~ std_normal();
  sd_phi ~ student_t(4, 0, 1);
  L_phi ~ lkj_corr_cholesky(2);
  for (j in 1:J) {
    sigmawj = add_diag(exp(2 * xi[1, j]) .* LLt, square(thetaw));
    target += multi_normal_suff_lpdf(sw[j] | sigmawj, n[j] - 1);
    ybar[j] ~ multi_normal(nu + mu[, j], add_diag(sigmawj / n[j], square(thetab)));
  }
}
// Rotating the factor loadings
// (see https://discourse.mc-stan.org/t/latent-factor-loadings/1483)
generated quantities {
  vector[p] lambda; // final loading vector
  real cor_logpsiw_etab;
  {
    int sign_l1 = lambda_star[1] > 0 ? 1 : -1;
    lambda = sign_l1 * lambda_star;
    cor_logpsiw_etab = sign_l1 * L_phi[2, 1];
  }
}
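The Stan program above is presumably compiled into the object mcfa_randvar by the cmdstanr knitr engine registered at the top of the post; an equivalent manual compile (with a hypothetical file name) would be:
# Manual compilation; the file name is hypothetical, as the post compiles the
# program through the cmdstanr knitr engine
mcfa_randvar <- cmdstan_model("mcfa_randvar.stan")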
# Prepare data for STAN
yc <- sweep(dat[1:4], MARGIN = 2, STATS = colMeans(dat[1:4])) # centered data
nj <- table(dat$id)
# Sufficient statistics
sswj <- tapply(yc, INDEX = dat$id,
               FUN = \(x) tcrossprod(t(x) - colMeans(x)))
ybarj <- tapply(dat[1:4], INDEX = dat$id, FUN = \(x) colMeans(x))
# Run STAN
mcfa_randvar_fit <- mcfa_randvar$sample(
  data = list(
    p = 4,
    J = num_clus,
    n = nj,
    sw = sswj,
    ybar = ybarj
  ),
  chains = 3,
  parallel_chains = 3,
  iter_warmup = 500,
  iter_sampling = 1000,
  refresh = 500
)
mcfa_randvar_fit$summary(
  c("lambda", "thetaw", "thetab", "sd_phi", "cor_logpsiw_etab")
) |>
  knitr::kable(digits = 2)
variable | mean | median | sd | mad | q5 | q95 | rhat | ess_bulk | ess_tail |
---|---|---|---|---|---|---|---|---|---|
lambda[1] | 0.61 | 0.61 | 0.02 | 0.02 | 0.58 | 0.64 | 1.01 | 416.82 | 951.89 |
lambda[2] | 0.78 | 0.78 | 0.02 | 0.02 | 0.75 | 0.82 | 1.01 | 372.45 | 688.70 |
lambda[3] | 0.62 | 0.62 | 0.02 | 0.02 | 0.59 | 0.65 | 1.01 | 388.12 | 627.81 |
lambda[4] | 0.70 | 0.70 | 0.02 | 0.02 | 0.66 | 0.73 | 1.01 | 411.88 | 716.34 |
thetaw[1] | 0.71 | 0.71 | 0.01 | 0.01 | 0.69 | 0.72 | 1.00 | 3409.43 | 2070.68 |
thetaw[2] | 0.70 | 0.70 | 0.01 | 0.01 | 0.69 | 0.72 | 1.00 | 2969.34 | 1962.94 |
thetaw[3] | 0.62 | 0.62 | 0.01 | 0.01 | 0.60 | 0.63 | 1.00 | 2880.06 | 1899.37 |
thetaw[4] | 0.64 | 0.64 | 0.01 | 0.01 | 0.63 | 0.66 | 1.00 | 3217.95 | 2394.14 |
thetab[1] | 0.03 | 0.02 | 0.02 | 0.02 | 0.00 | 0.06 | 1.00 | 1792.84 | 1053.89 |
thetab[2] | 0.03 | 0.03 | 0.02 | 0.02 | 0.00 | 0.08 | 1.00 | 1627.72 | 907.86 |
thetab[3] | 0.33 | 0.33 | 0.02 | 0.02 | 0.30 | 0.36 | 1.00 | 3307.89 | 2123.94 |
thetab[4] | 0.20 | 0.20 | 0.02 | 0.02 | 0.17 | 0.22 | 1.00 | 3168.89 | 2084.24 |
sd_phi[1] | 0.52 | 0.52 | 0.03 | 0.03 | 0.48 | 0.56 | 1.00 | 633.21 | 1289.95 |
sd_phi[2] | 0.38 | 0.38 | 0.03 | 0.03 | 0.33 | 0.42 | 1.00 | 834.20 | 1498.27 |
cor_logpsiw_etab | -0.60 | -0.60 | 0.06 | 0.07 | -0.70 | -0.49 | 1.01 | 586.29 | 1360.98 |
Trial 2
The previous approach requires conditioning on two latent variables: \(\eta^b_j\) and \(\log(\psi^w_j)\). For normal models, however, we can reduce the dimensionality using a trick introduced by Du Toit and Cudeck (2009). First, we decompose the covariance matrix in Equation 1 into standard deviations and a correlation matrix (see https://mc-stan.org/docs/stan-users-guide/regression.html#multivariate-regression-example):
\[ V\begin{pmatrix} \eta^b_j \\ \log[\psi^w_j] \end{pmatrix} = \mathop{\mathrm{diag}}(\boldsymbol{\mathbf{\sigma}}_\phi) \boldsymbol{\mathbf{\Omega}}_\phi \mathop{\mathrm{diag}}(\boldsymbol{\mathbf{\sigma}}_\phi) = \begin{bmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{bmatrix} \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix} \begin{bmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{bmatrix}. \]
Further, for positive definite \(\boldsymbol{\mathbf{\Omega}}_\phi\), we can use the Cholesky decomposition:
\[ \boldsymbol{\mathbf{\Omega}}_\phi = \boldsymbol{\mathbf{L}}_\phi \boldsymbol{\mathbf{L}}_\phi^\top = \begin{bmatrix} 1 & 0 \\ \rho & \sqrt{1 - \rho^2} \end{bmatrix} \begin{bmatrix} 1 & \rho \\ 0 & \sqrt{1 - \rho^2} \end{bmatrix} \]
Also, the conditional distribution of \(\eta^b_j \mid \psi^w_j\) is \(N(\alpha_{\mid \psi^w}, \psi^b_{\mid \psi^w})\), where
\[ \begin{aligned} \alpha_{\mid \psi^w} & = \alpha_1 + \rho \sigma_1 Z_{\log(\psi^w_j)} \\ \psi^b_{\mid \psi^w} & = \sigma^2_1 [1 - \rho^2], \end{aligned} \]
One can thus first sample \(\log(\psi^w_j)\) from a univariate normal distribution, and then update \(\alpha_{\mid \psi^w}\) and \(\psi^b_{\mid \psi^w}\) to derive \(\boldsymbol{\mathbf{\Sigma}}^w_j\), \(\boldsymbol{\mathbf{\mu}}_j\) (implied item means), and \(\boldsymbol{\mathbf{\Sigma}}^b\), conditioned on \(\psi^w_j\), as shown in the following implementation.
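Before the Stan implementation, here is a small R sketch of the conditioning step (my addition, with hypothetical values for \(\sigma_1\), \(\sigma_2\), and \(\rho\), and \(\alpha_1 = \alpha_2 = 0\)):
rho <- -0.5; sigma1 <- 0.4; sigma2 <- 0.5 # hypothetical values
z <- rnorm(5)                        # standard-normal draws for log(psi^w_j)
log_psiw <- sigma2 * z               # log within-cluster latent variance
alpha_cond <- rho * sigma1 * z       # conditional mean of eta^b_j
psib_cond <- sigma1^2 * (1 - rho^2)  # conditional variance of eta^b_j
cbind(psi_w = exp(log_psiw), E_etab = alpha_cond, V_etab = psib_cond)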
//
// This Stan program defines a one-factor two-level CFA model with a
// marginal likelihood approach similar to one described in
// https://ecmerkle.github.io/blavaan/articles/multilevel.html.
//
// Define function for computing multi-normal log density using sufficient
// statistics
functions {
real multi_normal_suff_lpdf(matrix S,
matrix Sigma,
int n) {
return -0.5 * (n * (log_determinant_spd(Sigma) + cols(S) * log(2 * pi())) +
trace(mdivide_left_spd(Sigma, S)));
}
}
// The input data are (a) sw, the cluster-specific within-cluster cross-product
// matrices, and (b) ybar, the cluster means.
data {
int<lower=1> p; // number of items
int<lower=1> J; // number of clusters
array[J] int<lower=1> n; // cluster size
array[J] matrix[p, p] sw; // within cross-product matrices
array[J] vector[p] ybar; // cluster means
}
// The parameters accepted by the model:
// 'lambda_star' for unscaled loadings, 'nu' for intercepts,
// 'thetaw' and 'thetab' for uniqueness at the within and the between levels,
// 'psib' for the between-cluster latent SD, and 'sd_logpsiw', 'Lphi', and
// 'z_logpsiw' for the random log latent SD
parameters {
vector[p] lambda_star; // unscaled loading vector assuming cross-level
// invariance
vector<lower=0>[p] thetaw; // within-cluster uniqueness vector
vector<lower=0>[p] thetab; // between-cluster uniqueness vector
cholesky_factor_corr[2] Lphi; // cholesky of correlation matrix of (logvar, mean)
real<lower=0> sd_logpsiw; // SD of log(SD) of etaw
real<lower=0> psib; // between-cluster latent standard deviation
row_vector[J] z_logpsiw; // scaled value of log(SD) of etaw
vector[p] nu; // item means/intercepts
}
// The model to be estimated. We compute the log-likelihood as the sum of
// two components: cluster-mean centered variables are multivariate normal
// with covariance matrix sigmaw (and degree of freedom adjustment),
// and cluster-mean variables are multivariate normal with covariance
// matrix sigmab + sigmaw / nj
model {
  matrix[p, J] mu = lambda_star * ((psib * Lphi[2, 1]) .* z_logpsiw);
  row_vector[J] psiw = exp(sd_logpsiw .* z_logpsiw);
  matrix[p, p] LLt = lambda_star * lambda_star';
  matrix[p, p] sigmab = add_diag(square(psib * Lphi[2, 2]) .* LLt, square(thetab));
  lambda_star ~ normal(0, 10);
  thetaw ~ gamma(1, .5);
  thetab ~ gamma(1, .5);
  psib ~ gamma(1, .5);
  nu ~ normal(0, 32);
  z_logpsiw ~ std_normal();
  sd_logpsiw ~ gamma(1, 0.5);
  Lphi ~ lkj_corr_cholesky(2);
  for (j in 1:J) {
    matrix[p, p] sigmawj = add_diag(square(psiw[j]) .* LLt, square(thetaw));
    target += multi_normal_suff_lpdf(sw[j] | sigmawj, n[j] - 1);
    ybar[j] ~ multi_normal(nu + mu[, j], sigmab + sigmawj / n[j]);
  }
}
// Rotating the factor loadings
// (see https://discourse.mc-stan.org/t/latent-factor-loadings/1483)
generated quantities {
  vector[p] lambda; // final loading vector
  real cor_logpsiw_etab;
  {
    int sign_l1 = lambda_star[1] > 0 ? 1 : -1;
    lambda = sign_l1 * lambda_star;
    cor_logpsiw_etab = sign_l1 * Lphi[2, 1];
  }
}
# Run STAN
mcfa_randvar2_fit <- mcfa_randvar2$sample(
  data = list(
    p = 4,
    J = num_clus,
    n = nj,
    sw = sswj,
    ybar = ybarj
  ),
  chains = 3,
  parallel_chains = 3,
  iter_warmup = 500,
  iter_sampling = 1000,
  refresh = 500
)
mcfa_randvar2_fit$summary(
  c("lambda", "thetaw", "thetab", "sd_logpsiw", "psib", "cor_logpsiw_etab")
) |>
  knitr::kable(digits = 2)
variable | mean | median | sd | mad | q5 | q95 | rhat | ess_bulk | ess_tail |
---|---|---|---|---|---|---|---|---|---|
lambda[1] | 0.61 | 0.61 | 0.02 | 0.02 | 0.58 | 0.64 | 1 | 675.59 | 1318.93 |
lambda[2] | 0.78 | 0.78 | 0.02 | 0.02 | 0.74 | 0.82 | 1 | 624.06 | 1247.15 |
lambda[3] | 0.62 | 0.62 | 0.02 | 0.02 | 0.59 | 0.65 | 1 | 635.16 | 1158.55 |
lambda[4] | 0.70 | 0.70 | 0.02 | 0.02 | 0.66 | 0.73 | 1 | 603.41 | 1079.80 |
thetaw[1] | 0.71 | 0.71 | 0.01 | 0.01 | 0.70 | 0.72 | 1 | 5695.97 | 1845.04 |
thetaw[2] | 0.70 | 0.70 | 0.01 | 0.01 | 0.69 | 0.72 | 1 | 4627.07 | 2296.33 |
thetaw[3] | 0.62 | 0.62 | 0.01 | 0.01 | 0.60 | 0.63 | 1 | 4026.94 | 1949.65 |
thetaw[4] | 0.64 | 0.64 | 0.01 | 0.01 | 0.63 | 0.66 | 1 | 4727.75 | 2143.58 |
thetab[1] | 0.03 | 0.02 | 0.02 | 0.02 | 0.00 | 0.06 | 1 | 2165.05 | 1508.75 |
thetab[2] | 0.03 | 0.03 | 0.02 | 0.02 | 0.00 | 0.08 | 1 | 2245.10 | 1644.61 |
thetab[3] | 0.33 | 0.33 | 0.02 | 0.02 | 0.31 | 0.36 | 1 | 4212.69 | 1915.84 |
thetab[4] | 0.20 | 0.20 | 0.02 | 0.02 | 0.17 | 0.22 | 1 | 4267.48 | 2346.73 |
sd_logpsiw | 0.52 | 0.52 | 0.03 | 0.03 | 0.48 | 0.56 | 1 | 979.93 | 1723.84 |
psib | 0.38 | 0.38 | 0.03 | 0.03 | 0.33 | 0.42 | 1 | 1672.81 | 2044.86 |
cor_logpsiw_etab | -0.60 | -0.60 | 0.06 | 0.06 | -0.70 | -0.49 | 1 | 1715.41 | 2175.00 |
The second approach is faster and appears more efficient than the Mplus approach. It is unclear why the estimated spread of \(\log(\psi^w_j)\) is larger with the Mplus approach; part of the difference may simply be one of scale, since Mplus models the log of the latent variance whereas the Stan programs above effectively work with the log of the latent SD (note the exp(2 * xi[1, j]) and square(psiw[j]) terms).
data.frame(
  model = c("Mplus (Gibbs)", "Sampling both $\\eta^b$ and $\\log(\\psi^w)$", "Sampling only $\\log(\\psi^w)$"),
  ess = c(
    posterior::summarise_draws(mplus_draws, "ess_bulk")[[2]][17],
    mcfa_randvar_fit$summary(variables = "sd_phi[2]", posterior::ess_bulk)[[2]],
    mcfa_randvar2_fit$summary(variables = "psib", posterior::ess_bulk)[[2]]
  ),
  time = c(
    mplus_sec,
    mcfa_randvar_fit$time()$total,
    mcfa_randvar2_fit$time()$total
  )
) |>
  knitr::kable(digits = 2, col.names = c("Model", "ESS (Bulk) for $\\psi^b$", "Time (seconds)"))
Model | ESS (Bulk) for \(\psi^b\) | Time (seconds) |
---|---|---|
Mplus (Gibbs) | 448.34 | 171.00 |
Sampling both \(\eta^b\) and \(\log(\psi^w)\) | 834.20 | 229.72 |
Sampling only \(\log(\psi^w)\) | 1672.81 | 170.23 |
sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────
setting value
version R version 4.4.1 (2024-06-14)
os Ubuntu 24.04.1 LTS
system x86_64, linux-gnu
ui X11
language (EN)
collate en_US.UTF-8
ctype en_US.UTF-8
tz America/Los_Angeles
date 2024-12-07
pandoc 3.3 @ /usr/bin/ (via rmarkdown)
─ Packages ───────────────────────────────────────────────────────────────────
package * version date (UTC) lib source
abind 1.4-5 2016-07-21 [3] CRAN (R 4.2.0)
backports 1.4.1 2021-12-13 [3] CRAN (R 4.2.0)
boot 1.3-30 2024-02-26 [4] CRAN (R 4.3.3)
checkmate 2.3.1 2023-12-04 [3] CRAN (R 4.3.2)
cli 3.6.3 2024-06-21 [1] CRAN (R 4.4.1)
cmdstanr * 0.8.1 2024-09-01 [1] https://stan-dev.r-universe.dev (R 4.4.1)
coda 0.19-4.1 2024-01-31 [3] CRAN (R 4.3.2)
colorspace 2.1-0 2023-01-23 [3] CRAN (R 4.2.2)
data.table 1.15.0 2024-01-30 [3] CRAN (R 4.3.2)
digest 0.6.37 2024-08-19 [1] CRAN (R 4.4.1)
distributional 0.4.0 2024-02-07 [3] CRAN (R 4.3.2)
dplyr 1.1.4 2023-11-17 [3] CRAN (R 4.3.2)
evaluate 0.23 2023-11-01 [3] CRAN (R 4.3.2)
fansi 1.0.6 2023-12-08 [3] CRAN (R 4.3.2)
fastDummies 1.7.3 2023-07-06 [3] CRAN (R 4.3.2)
fastmap 1.2.0 2024-05-15 [1] CRAN (R 4.4.1)
generics 0.1.3 2022-07-05 [3] CRAN (R 4.2.1)
ggplot2 3.4.4 2023-10-12 [3] CRAN (R 4.3.1)
glue 1.8.0 2024-09-30 [1] CRAN (R 4.4.1)
gsubfn 0.7 2018-03-16 [3] CRAN (R 4.1.1)
gtable 0.3.4 2023-08-21 [3] CRAN (R 4.3.1)
htmltools 0.5.7 2023-11-03 [3] CRAN (R 4.3.2)
htmlwidgets 1.6.4 2023-12-06 [3] CRAN (R 4.3.2)
httr 1.4.7 2023-08-15 [3] CRAN (R 4.3.1)
jsonlite 1.8.8 2023-12-04 [3] CRAN (R 4.3.2)
knitr 1.45 2023-10-30 [3] CRAN (R 4.3.2)
lattice 0.22-5 2023-10-24 [4] CRAN (R 4.3.1)
lavaan * 0.6-19 2024-09-26 [1] CRAN (R 4.4.1)
lifecycle 1.0.4 2023-11-07 [3] CRAN (R 4.3.2)
magrittr 2.0.3 2022-03-30 [3] CRAN (R 4.2.0)
MASS 7.3-61 2024-06-13 [4] CRAN (R 4.4.1)
matrixStats 1.2.0 2023-12-11 [3] CRAN (R 4.3.2)
mnormt 2.1.1 2022-09-26 [3] CRAN (R 4.2.2)
MplusAutomation * 1.1.1 2024-01-30 [3] CRAN (R 4.3.2)
munsell 0.5.0 2018-06-12 [3] CRAN (R 4.0.1)
pander 0.6.5 2022-03-18 [3] CRAN (R 4.3.0)
pbivnorm 0.6.0 2015-01-23 [3] CRAN (R 4.0.2)
pillar 1.9.0 2023-03-22 [3] CRAN (R 4.3.0)
pkgconfig 2.0.3 2019-09-22 [3] CRAN (R 4.0.1)
plyr 1.8.9 2023-10-02 [3] CRAN (R 4.3.1)
posterior * 1.6.0 2024-07-03 [1] CRAN (R 4.4.1)
processx 3.8.3 2023-12-10 [3] CRAN (R 4.3.2)
proto 1.0.0 2016-10-29 [3] CRAN (R 4.0.1)
ps 1.7.6 2024-01-18 [3] CRAN (R 4.3.2)
quadprog 1.5-8 2019-11-20 [3] CRAN (R 4.2.0)
R6 2.5.1 2021-08-19 [1] CRAN (R 4.4.1)
Rcpp 1.0.13-1 2024-11-02 [1] CRAN (R 4.4.1)
rlang 1.1.4 2024-06-04 [1] CRAN (R 4.4.1)
rmarkdown 2.25 2023-09-18 [3] CRAN (R 4.3.2)
scales 1.3.0 2023-11-28 [3] CRAN (R 4.3.2)
sessioninfo 1.2.2 2021-12-06 [3] CRAN (R 4.2.0)
tensorA 0.36.2.1 2023-12-13 [3] CRAN (R 4.3.2)
texreg 1.39.3 2023-11-10 [3] CRAN (R 4.3.2)
tibble 3.2.1 2023-03-20 [3] CRAN (R 4.3.1)
tidyselect 1.2.0 2022-10-10 [3] CRAN (R 4.2.2)
utf8 1.2.4 2023-10-22 [3] CRAN (R 4.3.2)
vctrs 0.6.5 2023-12-01 [3] CRAN (R 4.3.2)
withr 3.0.0 2024-01-16 [3] CRAN (R 4.3.2)
xfun 0.41 2023-11-01 [3] CRAN (R 4.3.2)
xtable 1.8-4 2019-04-21 [3] CRAN (R 4.0.1)
yaml 2.3.8 2023-12-11 [3] CRAN (R 4.3.2)
[1] /home/markl/R/x86_64-pc-linux-gnu-library/4.4
[2] /usr/local/lib/R/site-library
[3] /usr/lib/R/site-library
[4] /usr/lib/R/library
──────────────────────────────────────────────────────────────────────────────