On multicore systems, we suggest use of future::plan("multicore") or
future::plan("multisession") for faster post-MCMC computations.
future::plan("multicore", workers =3)
Model Equation
For response vector of length for person in cluster , the multilevel factor model is
with
Identification Constraints
For each component in , we need one constraint on the mean structure and one constraint on the covariance structure. Conventionally, for the mean structure, we either set or for one item; for the covariance structure, we either set or for one item.
Assumptions
Here we assume cross-level loading invariance (see this paper and this paper, for example), such that there is only one loading matrix that applies to both levels, so that the latent variable has both a within-level and a between-level component:
However, the loadings can still be randomly varying across clusters, which will be a discussion for Part II.
We also assume that is i.i.d. across clusters and that is i.i.d. across individuals.
Conditional Independence
In Bayesian estimation we try to find conditional independence in the data so that the estimation can be simplified. Here, assuming multivariate normality of and , we have independent observations when conditioned on the level-2 parameters:
where . Note that the within-cluster covariance is constant.
Simulate Data
Set up: One factor, Four items; two items with random intercepts
blavaan NOTE: two-level models are new, please report bugs!
https://github.com/ecmerkle/blavaan/issues
Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#bulk-ess
Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#tail-ess
Computing post-estimation metrics (including lvs if requested)...
It takes about 32.568 seconds to estimate the model.
Using STAN
Columnwise approach
Here’s the code I used to fit multilevel factor model many years ago. It assumes that is diagonal, aka local independence, so that I can treat each item as independent normal variable (after conditioning on the parameters). It does require sampling the latent variables: , , and .
Listing 1: STAN code for the columnwise approach
//// This Stan program defines a one-factor two-level CFA model with parameter // expansion implemented in Merkle & Wang (2018). //// The input data is a matrix 'y' of 'N' rows and 'p' columns.data {int<lower=0> N; // number of observationsint<lower=0> p; // number of itemsint<lower=0> J; // number of clustersarray[N] int<lower=1, upper=J> jj; // cluster membership indicatorarray[p] vector[N] Y; // item responses}// The parameters accepted by the model. Our model// accepts these parameters: // 'lambda_star' for unscaled loadings, 'nu' for intercepts, // 'thetaw' and 'thetab' for uniqueness at the within and the between levels// 'etaw_star' and 'z_etab_star' for the standardized within-cluster and // between-cluster latent factor scores// 'z_epsb' for standardized between-cluster uniqe factor scores, and // 'psib' for latent factor SDparameters {// vector[p] lambdaw_star;// vector[p] lambdab_star;vector[p] lambda_star; // unscaled loading vector assuming cross-level // invariancevector[p] nu; // intercept vectorvector<lower=0>[p] thetaw; // within-cluster uniqueness vectorvector<lower=0>[p] thetab; // between-cluster uniqueness vectorvector[N] etaw_star; // within-cluster latent scoresvector[J] z_etab_star; // unscaled between-cluster latent scoresreal<lower=0> psib; // between-cluster latent standard deviationmatrix[J, p] z_epsb; // unscaled between-cluster uniqe factor score matrix}// The model to be estimated. We model each column in the output// 'y' to be normally distributed with mean // 'nu + lambda_star * etaw_star + Yb' and standard deviation 'thetaw', where// 'Yb' = 'etab_star * lambda_star + epsb', and 'epsb' = z_epsb * thetab'model {// lambdaw_star ~ std_normal();// lambdab_star ~ std_normal(); lambda_star ~ std_normal(); etaw_star ~ std_normal(); z_etab_star ~ std_normal(); thetaw ~ student_t(4, 0, 1); thetab ~ std_normal(); psib ~ std_normal(); to_vector(z_epsb) ~ std_normal(); for (i in1:p) {// 'Yb = etab_star * lambda_star + epsb'vector[J] Yb = z_etab_star * psib * lambda_star[i] + col(z_epsb, i) * thetab[i];// 'Y ~ N(nu + lambda_star * etaw_star + Yb, thetaw)' Y[i] ~ normal(nu[i] + etaw_star * lambda_star[i] + Yb[jj], thetaw[i]); }}generated quantities {vector[p] lambda; // final loading vectorvector[N] etaw; // final within-cluster factor scorevector[J] etab; // final between-cluster factor score {int sign_l1 = lambda_star[1] > 0 ? 1 : -1; lambda = sign_l1 * lambda_star; etaw = sign_l1 * etaw_star; etab = sign_l1 * z_etab_star * psib; }}
This is still relatively quick, but won’t work when is not diagonal.
Marginal likelihood using sufficient statistics
From citations such as this and this and this, the likelihood function for cluster is
This shows that the likelihood function has two multivariate normal components: that is multivariate normal with mean 0 and covariance with a sample size of , and that is multivariate normal with mean 0 and covariance .
As is a scalar and can be written as , the sufficient statistics in the above likelihood function is the cluster-specific sample cross-product matrix, = , and the sample cluster means, .
Tip
When the within-cluster covariance is assumed homogeneous, the likelihood function will depend only on the total within-sample cross-product matrix,
Tip
When the design is balanced such that the cluster sizes are equal, on top of homogeneous of covariance condition, the likelihood function will depend only on the between-sample cross-product matrix,
The marginal likelihood approach is also used in blavaan. Below, I explore coding the model in STAN. Note that I ignore the mean structure since it is saturated in the factor model (as in single-level analysis), such that the last term in Equation 1 is 0.
Balanced design, pooled matrices
//// This Stan program defines a one-factor two-level CFA model with a// marginal likelihood approach similar to one described in// https://ecmerkle.github.io/blavaan/articles/multilevel.html. //// Define function for computing multi-normal log density using sufficient// statisticsfunctions {real multi_normal_suff_lpdf(matrix S,matrix Sigma,int n) {return -0.5 * (n * (log_determinant_spd(Sigma) + cols(S) * log(2 * pi())) + trace(mdivide_left_spd(Sigma, S))); }}// The input data are (a) sw, pooled within-cluster cross-product matrix,// and (b) sb, cluster-specific outer-product of cluster means from the // grand means.data {int<lower=1> p; // number of itemsint<lower=1> J; // number of clustersint<lower=1> n; // cluster sizematrix[p, p] sw; // pooled within cross-product matrixmatrix[p, p] sb; // between cross-product matrix of cluster mean}// Compute within degrees of freedom = N - Jtransformed data {int dfw = J * (n - 1);}// The parameters accepted by the model. Our model// accepts four parameters: // 'lambda_star' for unscaled loadings, 'nu' for intercepts, // 'thetaw' and 'thetab' for uniqueness at the within and the between levels,// and 'psib' for latent factor SDparameters {vector[p] lambda_star; // unscaled loading vector assuming cross-level // invariancevector<lower=0>[p] thetaw; // within-cluster uniqueness vectorvector<lower=0>[p] thetab; // between-cluster uniqueness vectorreal<lower=0> psib; // between-cluster latent standard deviation}// The model to be estimated. We compute the log-likelihood as the sum of// two components: cluster-mean centered variables are multivariate normal// with covariance matrix sigmaw (and degree of freedom adjustment),// and cluster-mean variables are multivariate normal with covariance// matrix sigmab + sigmaw / njmodel {matrix[p, p] LLt = lambda_star * lambda_star';matrix[p, p] sigmaw = add_diag(LLt, square(thetaw));matrix[p, p] sigmab = add_diag(square(psib) .* LLt, square(thetab)); lambda_star ~ normal(0, 10); thetaw ~ gamma(1, .5); thetab ~ gamma(1, .5); psib ~ gamma(1, .5);target += multi_normal_suff_lpdf(sw | sigmaw, dfw);target += multi_normal_suff_lpdf(sb | n * sigmab + sigmaw, J);}// Rotating the factor loadings// (see https://discourse.mc-stan.org/t/latent-factor-loadings/1483)generated quantities {vector[p] lambda; // final loading vector {int sign_l1 = lambda_star[1] > 0 ? 1 : -1; lambda = sign_l1 * lambda_star; }}
# Prepare data for STANyc <-sweep(dat[1:4], MARGIN =2, STATS =colMeans(dat[1:4])) # centered datan <-table(dat$id)[1]# Cross-product matricesssw <-crossprod(apply(yc, MARGIN =2, FUN = \(x) x -ave(x, dat$id)))ssb <-crossprod(apply(yc, MARGIN =2, FUN = ave, dat$id))# Run STANmcfa_suff_eq_fit <- mcfa_suff_eq$sample(data =list(p =4,J = num_clus,n = n,sw = ssw,sb = ssb ),chains =3,parallel_chains =3,iter_warmup =500,iter_sampling =1000,refresh =500)
//// This Stan program defines a one-factor two-level CFA model with a// marginal likelihood approach similar to one described in// https://ecmerkle.github.io/blavaan/articles/multilevel.html. The approach// allows for unequal cluster sizes, but assumes equal loadings and latent// variances across clusters, and every observations have complete data// on all items.//// Define function for computing multi-normal log density using sufficient// statisticsfunctions {real multi_normal_suff_lpdf(matrix S,matrix Sigma,int n) {return -0.5 * (n * (log_determinant_spd(Sigma) + cols(S) * log(2 * pi())) + trace(mdivide_left_spd(Sigma, S))); }}// The input data are (a) sw, cluster-specific within-cluster cross-product// matrix, and (b) ybar, cluster-specific cluster means of the items.data {int<lower=1> p; // number of itemsint<lower=1> J; // number of clustersarray[J] int<lower=1> n; // cluster sizearray[J] matrix[p, p] sw; // within cross-product matricesarray[J] vector[p] ybar; // cluster means}// The parameters accepted by the model. Our model// accepts four parameters: // 'lambda_star' for unscaled loadings, 'nu' for intercepts, // 'thetaw' and 'thetab' for uniqueness at the within and the between levels,// and 'psib' for latent factor SDparameters {vector[p] lambda_star; // unscaled loading vector assuming cross-level // invariancevector<lower=0>[p] thetaw; // within-cluster uniqueness vectorvector<lower=0>[p] thetab; // between-cluster uniqueness vectorreal<lower=0> psib; // between-cluster latent standard deviationvector[p] nu; // item means/intercepts}// The model to be estimated. We compute the log-likelihood as the sum of// two components: cluster-mean centered variables are multivariate normal// with covariance matrix sigmaw (and degree of freedom adjustment),// and cluster-mean variables are multivariate normal with covariance// matrix sigmab + sigmaw / njmodel {matrix[p, p] LLt = lambda_star * lambda_star';matrix[p, p] sigmaw = add_diag(LLt, square(thetaw));matrix[p, p] sigmab = add_diag(square(psib) .* LLt, square(thetab)); lambda_star ~ normal(0, 10); thetaw ~ gamma(1, .5); thetab ~ gamma(1, .5); psib ~ gamma(1, .5); nu ~ normal(0, 32);for (j in1:J) {target += multi_normal_suff_lpdf(sw[j] | sigmaw, n[j] - 1); ybar[j] ~ multi_normal(nu, sigmab + sigmaw / n[j]); }}// Rotating the factor loadings// (see https://discourse.mc-stan.org/t/latent-factor-loadings/1483)generated quantities {vector[p] lambda; // final loading vector {int sign_l1 = lambda_star[1] > 0 ? 1 : -1; lambda = sign_l1 * lambda_star; }}
# Prepare data for STANnj <-table(dat$id)# Cluster-specific cross-product matricessswj <-tapply(dat[1:4], INDEX = dat$id,FUN = \(x) tcrossprod(t(x) -colMeans(x)))ybarj <-tapply(dat[1:4], INDEX = dat$id, FUN = \(x) colMeans(x))# ssbj <- tapply(yc, INDEX = dat$id,# FUN = \(x) nrow(x) * tcrossprod(colMeans(x)))# Run STANmcfa_suffj_fit <- mcfa_suffj$sample(data =list(p =4,J = num_clus,n = nj,sw = sswj,ybar = ybarj ),chains =3,parallel_chains =3,iter_warmup =500,iter_sampling =1000,refresh =500)