Reason: Cost-effective and convenient
However, it leads to nested data with (usually) unequal selection probabilities
Example: 2000 PISA (US data; OECD, 2000)
Decompose variance into between-cluster and within-cluster components
Model varying intercepts and slopes across clusters
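For a random-intercept model, the decomposition can be sketched as follows. The symbols $\tau^2$ (between-cluster variance) and $\sigma^2$ (within-cluster variance) are notation introduced here for illustration, and $u_{0j}$ and $e_{ij}$ are assumed independent.

```latex
\begin{aligned}
Y_{ij} &= \beta_0 + u_{0j} + e_{ij}, \qquad
  \operatorname{Var}(u_{0j}) = \tau^2, \quad \operatorname{Var}(e_{ij}) = \sigma^2 \\
\operatorname{Var}(Y_{ij}) &= \tau^2 + \sigma^2 \\
\text{ICC} &= \frac{\tau^2}{\tau^2 + \sigma^2}
\end{aligned}
```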
Multilevel pseudo maximum likelihood (MPML)
Maximize the weighted likelihood function
Standard errors obtained with the sandwich estimator
Requires large samples (both within and between clusters)
Relies on distributional assumptions
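The weighted likelihood that MPML maximizes can be written, for a two-level model, roughly in the form given by Rabe-Hesketh and Skrondal (2006). The notation is introduced here for illustration: $w_j$ is the level-2 weight for cluster $j$, $w_{i|j}$ is the conditional level-1 weight for unit $i$ in cluster $j$, $f$ is the level-1 density, and $g$ is the random-effect density. In practice the level-1 weights are usually rescaled before estimation.

```latex
\log L_w(\theta) \;=\; \sum_{j} w_j \,
  \log \int \exp\!\Big\{ \sum_{i} w_{i|j} \,
  \log f\big(y_{ij} \mid u_j; \theta\big) \Big\} \,
  g(u_j; \theta)\, du_j
```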
Parametric (functional form + distribution)
Residual (functional form)
Case (only conditional independence)
They are implemented in the bootmlm package (https://github.com/marklhc/bootmlm)
The multilevel residual bootstrap has a good balance of robustness to assumption violations and efficiency (e.g., Lai, 2020)
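To make the least restrictive of these schemes concrete, here is a minimal base-R sketch of one case-bootstrap replicate. It is not the bootmlm implementation: the helper `case_boot_once()` and the toy data are hypothetical, and resampling at both levels is only one of several case-bootstrap variants (others resample clusters only).

```r
# Toy two-level data: 4 clusters with 3 observations each (hypothetical values)
set.seed(1)
dat <- data.frame(
  cluster = rep(1:4, each = 3),
  y = rnorm(12)
)

# One case-bootstrap replicate: resample clusters, then cases within clusters.
# Hypothetical helper for illustration, not part of bootmlm.
case_boot_once <- function(data, cluster_var) {
  ids <- unique(data[[cluster_var]])
  drawn <- ids[sample.int(length(ids), replace = TRUE)]
  pieces <- lapply(seq_along(drawn), function(k) {
    rows <- which(data[[cluster_var]] == drawn[k])
    # Index into the row numbers so single-row clusters are handled correctly
    resampled <- data[rows[sample.int(length(rows), replace = TRUE)], , drop = FALSE]
    resampled$boot_cluster <- k  # relabel so a cluster drawn twice stays distinct
    resampled
  })
  do.call(rbind, pieces)
}

boot_dat <- case_boot_once(dat, "cluster")
```

The model would then be refit to `boot_dat` with `boot_cluster` as the grouping variable, and the process repeated many times.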
Builds on previous work
Pseudopopulation (Wang & Thompson, 2012)
Resampling and rescaling of weights (Kovacevic et al., 2006)
1. Obtain MLM parameter estimates and residuals with unweighted ML/REML
2. Reflate the residuals
3. Sample the residuals with replacement, using the sampling weights
4. Form new responses
5. Refit the model with unweighted ML
6. Obtain the bootstrap distributions
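Steps 3 and 4 can be sketched in base R as below. This is an illustration under stated assumptions, not the bootmlm code: the inputs are simulated stand-ins for the output of steps 1 and 2, `weighted_resample()` is a hypothetical helper, and drawing each residual with probability proportional to its weight is one plausible reading of "sample with replacement and weights". The package may rescale or apply the weights differently.

```r
set.seed(2)
J <- 5     # number of sampled clusters
n_j <- 4   # observations per cluster
cluster <- rep(seq_len(J), each = n_j)

# Hypothetical inputs standing in for the output of steps 1 and 2
fitted_fixed <- rep(10, J * n_j)  # fixed-effect part of the fitted values
u_hat <- rnorm(J)                 # reflated level-2 residuals
e_hat <- rnorm(J * n_j)           # reflated level-1 residuals
w2 <- runif(J, 1, 5)              # level-2 sampling weights
w1 <- runif(J * n_j, 1, 5)        # level-1 sampling weights

# Hypothetical helper: draw with replacement, probability proportional to w
weighted_resample <- function(resid, w) {
  resid[sample.int(length(resid), replace = TRUE, prob = w)]
}

# Step 3: resample residuals at each level using the weights
u_star <- weighted_resample(u_hat, w2)
e_star <- weighted_resample(e_hat, w1)

# Step 4: form new responses from the fixed part plus resampled residuals
y_star <- fitted_fixed + u_star[cluster] + e_star
```

Steps 5 and 6 would refit the model to `y_star` and collect the estimates across many such replicates.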
Superpopulation: $Y_{ij} = \beta_0 + \beta_1 X_{1ij} + \beta_2 X_{2j} + u_{0j} + e_{ij}$
Finite population: $J_{\text{pop}} = 500$ clusters and $n_{\text{pop}} = 100$ observations for each cluster
Informative sampling
Two lv-2 strata: 70% for $u_{0j} > 0$, 30% for $u_{0j} < 0$
Two lv-1 strata: 70% for $e_{ij} > 0$, 30% for $e_{ij} < 0$
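The data-generating setup can be sketched in base R as follows. Several specifics are assumptions made for illustration and are not given above: the coefficient values in `beta`, the total variance of 1, standard normal predictors, and the reading that 70% of the sampled clusters come from the $u_{0j} > 0$ stratum. Only informative selection at level 2 is shown; level 1 would follow the same pattern using the sign of $e_{ij}$.

```r
set.seed(3)
J_pop <- 500
n_pop <- 100
icc <- 0.2              # one of the ICC levels in the design
beta <- c(0, 0.5, 0.5)  # hypothetical coefficient values

# Generate the finite population from the superpopulation model
u0 <- rnorm(J_pop, sd = sqrt(icc))
x2 <- rnorm(J_pop)
pop <- data.frame(cluster = rep(seq_len(J_pop), each = n_pop))
pop$x1 <- rnorm(nrow(pop))
pop$e <- rnorm(nrow(pop), sd = sqrt(1 - icc))
pop$y <- beta[1] + beta[2] * pop$x1 + beta[3] * x2[pop$cluster] +
  u0[pop$cluster] + pop$e

# Informative level-2 selection with sampling fraction 0.1:
# 70% of the sampled clusters from the u0 > 0 stratum, 30% from the rest
J_samp <- round(0.1 * J_pop)
n_hi <- round(0.7 * J_samp)
n_lo <- J_samp - n_hi
hi <- which(u0 > 0)
lo <- which(u0 <= 0)
sampled <- c(hi[sample.int(length(hi), n_hi)],
             lo[sample.int(length(lo), n_lo)])

# Level-2 weights: inverse selection probability within each stratum
w2 <- c(rep(length(hi) / n_hi, n_hi), rep(length(lo) / n_lo, n_lo))
```

Because the strata are defined by the sign of $u_{0j}$, ignoring `w2` in the analysis would overrepresent clusters with positive random intercepts.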
Factor | Levels |
---|---|
ICC | 0.05, 0.2, 0.5 |
Sampling fraction (both lv 1 and lv 2) | 0.1, 0.5 |
Distributions of random effects/errors | normal, $\chi^2$ with df = 2 |
Selection at lv 2 | non-informative, informative |
Selection at lv 1 | non-informative, informative |
Total = 48 conditions
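The count of conditions follows from fully crossing the five factors (3 × 2 × 2 × 2 × 2). A short sketch with `expand.grid()`; the column names and level labels are made up for illustration.

```r
# Fully crossed simulation design; column names are illustrative
design <- expand.grid(
  icc = c(0.05, 0.2, 0.5),
  sampling_fraction = c(0.1, 0.5),
  distribution = c("normal", "chisq_df2"),
  selection_lv2 = c("non-informative", "informative"),
  selection_lv1 = c("non-informative", "informative"),
  stringsAsFactors = FALSE
)
nrow(design)  # 48
```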
Analyses: Unweighted ML, MPML (effective weights), bootstrap
Evaluation: bias, coverage rates of 95% CI
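The two evaluation criteria can be computed across replications as in the sketch below. The helpers `bias()` and `coverage()` and the toy numbers are hypothetical and are not taken from the study.

```r
# est, lower, upper: one entry per simulation replication
bias <- function(est, truth) mean(est) - truth
coverage <- function(lower, upper, truth) {
  mean(lower <= truth & truth <= upper)
}

# Toy replications (hypothetical numbers, true value 0.5)
est   <- c(0.48, 0.52, 0.55, 0.45)
lower <- c(0.40, 0.44, 0.51, 0.37)
upper <- c(0.56, 0.60, 0.59, 0.53)

b  <- bias(est, truth = 0.5)
cv <- coverage(lower, upper, truth = 0.5)
```

A coverage rate near 0.95 for nominal 95% intervals indicates that the interval procedure is well calibrated.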
Point estimates of the fixed effects of $X_1$ and $X_2$ ($\beta_1$ and $\beta_2$) were close to unbiased
Coverage for $\beta_1$ was close to 95%
Bootstrap performed similarly to or better than MPML (intercept, level-2 effect)
Bootstrap more robust for nonnormal data (level-2 variance component)
Bias also found in level-1 effect when unequal selection was not accounted for
No convergence issues for the bootstrap; MPML had low convergence rates (0.59 to 0.76) with small samples and small ICC
Bootstrap slightly better for lv-2 predictor; MPML slightly better for lv-1 predictor
Bootstrap gave better variance component estimates
    # Install the development version of the bootmlm package
    remotes::install_github("marklhc/bootmlm", ref = "weighted_boot")

    # Load required packages
    library(bootmlm)
    library(boot)
    library(lme4)

    # Unweighted ML
    m1 <- lmer(SC17Q01 ~ ISEI_m + male + (1 | Sch_ID), data = PISA, REML = FALSE)

    # Weighted residual bootstrap
    boo <- bootstrap_mer(
      m1,
      # Return the fixed effects, then the level-2 and level-1 variances
      FUN = function(x) {
        c(x@beta, c(x@theta ^ 2, 1) * sigma(x) ^ 2)
      },
      nsim = 999L,
      type = "residual_cgr",
      w1 = PISA$W_FSTUWT,  # unconditional student weights
      w2 = unique(PISA[c("Sch_ID", "WNRSCHBW")])$WNRSCHBW  # school weights
    )

    # Print the output
    boo                  # bootstrap results
    colMeans(boo$t)      # parameter estimates
    apply(boo$t, 2, sd)  # bootstrap SE

    # Percentile intervals, one per element returned by FUN
    boot.ci(boo, type = "perc", index = 1L)
    boot.ci(boo, type = "perc", index = 2L)
    boot.ci(boo, type = "perc", index = 3L)
    boot.ci(boo, type = "perc", index = 4L)
    boot.ci(boo, type = "perc", index = 5L)
Multilevel weighted bootstrap is a good alternative to MPML for handling sampling weights
Especially when MPML does not converge (usually with small samples and small ICC)
Especially when normality may not hold
With the bootstrap, statistical inference for $\mathrm{Cov}(u_0, u_1)$ is not trustworthy (the reason is not yet clear)
Researchers should conduct sensitivity analysis with different methods (ML, MPML, weighted bootstrap)
Asparouhov, T. (2006). General multi-level modeling with sampling weights. Communications in Statistics - Theory and Methods, 35(3), 439–460.
Kovacevic, M. S., Huang, R., & You, Y. (2006). Bootstrapping for variance estimation in multi-level models fitted to survey data. ASA Proceedings of the Survey Research Methods Section, 3260–3269.
Lai, M. H. C. (2020). Bootstrap confidence intervals for multilevel standardized effect size. Multivariate Behavioral Research. Advance online publication. https://doi.org/10.1080/00273171.2020.1746902
Organization for Economic Co-operation and Development. (2000). Manual for the PISA 2000 Database. Paris: Organization for Economic Co-operation and Development. Retrieved from http://www.pisa.oecd.org/dataoecd/53/18/33688135.pdf
Pfeffermann, D., Skinner, C. J., Holmes, D. J., Goldstein, H., & Rasbash, J. (1998). Weighting for unequal selection probabilities in multi-level models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 60(1), 23–56.
Rabe-Hesketh, S., & Skrondal, A. (2006). Multilevel modelling of complex survey data. Journal of the Royal Statistical Society: Series A (Statistics in Society), 169(4), 805–827.
Stapleton, L. (2002). The incorporation of sample weights into multilevel structural equation models. Structural Equation Modeling, 9(4), 475–502.
Wang, Z., & Thompson, M. E. (2012). A resampling approach to estimate variance components of multilevel models. Canadian Journal of Statistics, 40(1), 150–171. https://doi.org/10.1002/cjs.10136
Slides created via the R package xaringan.
For questions, please email Wen (wluo@tamu.edu) or Mark (hokchiol@usc.edu).