Bootstrap confidence intervals for multilevel standardized effect size


Although many methodologists and professional organizations have urged applied researchers to compute and report effect size measures alongside tests of statistical significance, discussions of how to obtain confidence intervals (CIs) for effect sizes with clustered/multilevel data have been scarce. In this paper, I explore the bootstrap as a viable and accessible alternative for obtaining CIs for the multilevel standardized mean difference effect size in cluster-randomized trials. A simulation was carried out to compare 17 analytic and bootstrap procedures for constructing CIs for the multilevel effect size, in terms of empirical coverage rate and width, for both normal and nonnormal data. Results showed that, overall, the residual bootstrap with studentized CI had the best coverage rates (94.75% on average), whereas the residual bootstrap with basic CI had better coverage in small samples. Both procedures showed better coverage than the analytic methods for both normal and nonnormal data. In addition, I provide an illustrative example showing how bootstrap CIs for multilevel effect size can be easily obtained using the statistical software R and the R package ‘bootmlm’. I strongly encourage applied researchers to report CIs to adequately convey the uncertainty of their effect size estimates.
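To make the two interval types named in the abstract concrete, here is a minimal sketch of how a basic and a studentized bootstrap CI are formed once bootstrap replicates of an effect size are available. It is an illustration under stated assumptions, not the paper's procedure and not the ‘bootmlm’ API: the function names (`quantile`, `basic_ci`, `studentized_ci`) are hypothetical, the quantile rule is a simple linear interpolation, and the replicates are synthetic. In practice the replicates and their standard errors would come from a residual bootstrap of a fitted multilevel model, which is not shown here.

```python
import math
import random


def quantile(sorted_values, p):
    """Linearly interpolated quantile of an already sorted list (0 <= p <= 1)."""
    pos = p * (len(sorted_values) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def basic_ci(estimate, boot_estimates, level=0.95):
    """Basic bootstrap CI: reflect the bootstrap quantiles around the estimate."""
    alpha = 1 - level
    s = sorted(boot_estimates)
    return (2 * estimate - quantile(s, 1 - alpha / 2),
            2 * estimate - quantile(s, alpha / 2))


def studentized_ci(estimate, se, boot_estimates, boot_ses, level=0.95):
    """Studentized (bootstrap-t) CI: uses quantiles of (est* - est) / se*."""
    alpha = 1 - level
    t = sorted((b - estimate) / s for b, s in zip(boot_estimates, boot_ses))
    return (estimate - quantile(t, 1 - alpha / 2) * se,
            estimate - quantile(t, alpha / 2) * se)


# Synthetic replicates, for illustration only (not results from the paper).
random.seed(1)
est, se = 0.40, 0.10
boot = [random.gauss(est, se) for _ in range(2000)]
boot_se = [se * random.uniform(0.8, 1.2) for _ in range(2000)]

print("basic:      ", basic_ci(est, boot))
print("studentized:", studentized_ci(est, se, boot, boot_se))
```

The studentized interval needs a standard error for every bootstrap replicate, which is why it is more demanding to compute than the basic interval.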

Multivariate Behavioral Research, 56(4), 558–578