A New Effect Size Statistic for Measurement Non-Invariance With Multiple Groups and Multiple Grouping Variables

Acknowledgements

This research is based on work supported by the National Science Foundation (Grant 2141790)

The paper has been accepted for publication in Structural Equation Modeling

Measurement Invariance

Measurement invariance (MI): the same construct is measured in the same way across groups, time, etc.

Violation of MI → spurious/biased group differences

Most studies focused on binary conclusions (e.g., invariant or not)
Effect sizes rarely discussed or reported

“Replication Crisis” in MI Studies

Zhang (2022): Synthesis of 32 studies evaluating gender invariance of the Center for Epidemiologic Studies Depression Scale (CES-D)
- Drastic differences in MI findings across studies
- All 20 items were found noninvariant in at least one study
- Only a few provided sufficient information to compute effect sizes

Cohen’s d Analogue: d_MACS

(Nye et al., 2019; Nye & Drasgow, 2011)

E(Y | η) = expected item score at latent level η

Standardized mean difference in expected item scores due to noninvariance
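
As a rough illustration of the idea, the sketch below computes a d_MACS-type index for one continuous item under a linear factor model, following the definition in Nye and Drasgow (2011): the squared difference in expected item scores is integrated over the focal group's latent distribution and divided by the pooled item SD. All parameter values and variable names are hypothetical and are not taken from the talk.

```r
# Hypothetical parameters for one continuous item in two groups
# (R = reference, F = focal); all numbers are made up for illustration.
nu_R  <- 1.0; nu_F  <- 1.3    # intercepts (noninvariant)
lam_R <- 0.8; lam_F <- 0.8    # loadings (invariant in this example)
sd_pooled <- 1.2              # pooled SD of the observed item (assumed known)
alpha_F <- 0                  # focal-group latent mean
psi_F   <- 1                  # focal-group latent variance

# Squared difference in expected item scores at latent level eta,
# weighted by the focal group's (normal) latent density
integrand <- function(eta) {
  diff_expected <- (nu_R + lam_R * eta) - (nu_F + lam_F * eta)
  diff_expected^2 * dnorm(eta, mean = alpha_F, sd = sqrt(psi_F))
}

d_macs <- sqrt(integrate(integrand, -Inf, Inf)$value) / sd_pooled

# Closed form for the linear case, used as a check:
# E[(d_nu + d_lam * eta)^2] = (d_nu + d_lam * alpha)^2 + d_lam^2 * psi
d_nu  <- nu_R - nu_F
d_lam <- lam_R - lam_F
d_closed <- sqrt((d_nu + d_lam * alpha_F)^2 + d_lam^2 * psi_F) / sd_pooled

c(numeric = d_macs, closed_form = d_closed)  # both about 0.25
```

With equal loadings, the index reduces to the absolute intercept difference divided by the pooled SD (0.3 / 1.2 here).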

Multiple Groups and Grouping Variables

Cross-cultural, ethnicity × gender, etc.

Variability of Expected Item Score, E(Y | η), at a Given η

Cohen’s f Analogue: f_MACS

For K groups, each of size n_k, with total sample size N, f_MACS summarizes the expected (squared) deviation of the group-specific expected item scores from the grand mean due to noninvariance, in standardized units

For both continuous and categorical items
Like Cohen’s f, 2 f_MACS = d_MACS for two groups
f_MACS < .10 indicates a negligible effect size
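
The verbal definition above can be sketched numerically for a continuous item. The code below is an illustration under stated assumptions, not the pinsearch implementation: it assumes a linear factor model, weights groups by n_k / N, integrates over a common standard normal latent distribution, and treats the pooled item SD as known. The function name `f_macs_sketch` and all parameter values are hypothetical.

```r
# Sketch of an f_MACS-type index for one continuous item across K groups.
# nu, lam: group-specific intercepts and loadings; w: weights n_k / N.
f_macs_sketch <- function(nu, lam, w, sd_pooled, mean_eta = 0, sd_eta = 1) {
  integrand <- function(eta) {
    # Expected item scores: rows = eta values, columns = groups
    ey <- outer(eta, lam) + matrix(nu, length(eta), length(nu), byrow = TRUE)
    grand <- as.vector(ey %*% w)              # weighted grand mean at each eta
    sq_dev <- as.vector((ey - grand)^2 %*% w) # weighted squared deviations
    sq_dev * dnorm(eta, mean_eta, sd_eta)
  }
  sqrt(integrate(integrand, -Inf, Inf)$value) / sd_pooled
}

# Three hypothetical groups with noninvariant intercepts
n  <- c(200, 150, 150)
f3 <- f_macs_sketch(nu = c(1.0, 1.3, 1.1), lam = rep(0.8, 3),
                    w = n / sum(n), sd_pooled = 1.2)

# Two equal-sized groups: 2 * f equals the standardized intercept
# difference (0.3 / 1.2 = 0.25), mirroring the relation to Cohen's d
f2 <- f_macs_sketch(nu = c(1.0, 1.3), lam = rep(0.8, 2),
                    w = c(0.5, 0.5), sd_pooled = 1.2)

c(f_three_groups = f3, two_f_two_groups = 2 * f2)  # about 0.104 and 0.25
```

The two-group check relies on equal group sizes and on integrating over the same latent distribution in both indices.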

Empirical Example 2: Alcohol Beliefs Scale

Data: 1,148 U.S. undergraduates (2 gender × 3 ethnicity groups), from Lui (2019)
Measure: College Life Alcohol Salience Scale, 15 items

Osberg et al. (2010), p. 6, Table 2

library(pinsearch)
# Specification search for 
# partial invariance
ps <- pinSearch(
  mod, data = dat,
  group = "group",
  estimator = "MLR",
  missing = "fiml",
  type = "residual.covariances")
# Obtain omnibus fmacs
# effect size (for lavaan objects)
(f_omni <- pin_effsize(ps[[1]]))

8 items with noninvariant intercepts
f_MACS ranged from 0.06 to 0.15

Contrast: Main and Interaction

Like ANOVA, we can decompose the effect sizes into main and interaction effects
- Not orthogonal with unbalanced sample sizes

f_MACS effect sizes for the CLASS items
Item	Overall	Gender	Ethnicity	Gender × Ethnicity
class1	0.10	0.03	0.05	0.05
class2	0.10	0.08	0.06	0.06
class3	0.07	0.03	0.04	0.04
class4	0.11	0.04	0.09	0.05
class5	0.06	0.03	0.04	0.04
class7	0.08	0.00	0.07	0.00
class8	0.09	0.04	0.05	0.05
class14	0.15	0.04	0.07	0.07
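
As a small illustration of reading the table against the .10 benchmark for a negligible effect, the values can be entered as a data frame and filtered. The data frame and column names are chosen here for illustration; the numbers are copied from the table.

```r
# f_MACS values copied from the table (column names are illustrative)
fmacs <- data.frame(
  item        = c("class1", "class2", "class3", "class4",
                  "class5", "class7", "class8", "class14"),
  overall     = c(0.10, 0.10, 0.07, 0.11, 0.06, 0.08, 0.09, 0.15),
  gender      = c(0.03, 0.08, 0.03, 0.04, 0.03, 0.00, 0.04, 0.04),
  ethnicity   = c(0.05, 0.06, 0.04, 0.09, 0.04, 0.07, 0.05, 0.07),
  interaction = c(0.05, 0.06, 0.04, 0.05, 0.04, 0.00, 0.05, 0.07),
  stringsAsFactors = FALSE
)

# Items whose overall f_MACS reaches the .10 benchmark
flagged <- fmacs$item[fmacs$overall >= 0.10]
flagged               # "class1" "class2" "class4" "class14"
range(fmacs$overall)  # 0.06 0.15
```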

Other Supported Features

Test-level (unweighted or weighted sums)

pin_effsize(..., item_weights = rep(1, 15))

Bootstrap bias correction and confidence intervals

Conclusion

f_MACS: A versatile effect size for quantifying noninvariance across multiple groups and grouping variables.
Impact: Enhances transparency, replicability, and practical significance in MI research.

Questions?

Thank you for your attention!

References

Lui, P. P. (2019). College alcohol beliefs: Measurement invariance, mean differences, and correlations with alcohol use outcomes across sociodemographic groups. Journal of Counseling Psychology, 66(4), 487–495. https://doi.org/10.1037/cou0000338

Nye, C. D., Bradburn, J., Olenick, J., Bialko, C., & Drasgow, F. (2019). How big are my effects? Examining the magnitude of effect sizes in studies of measurement equivalence. Organizational Research Methods, 22(3), 678–709. https://doi.org/10.1177/1094428118761122

Nye, C. D., & Drasgow, F. (2011). Effect size indices for analyses of measurement equivalence: Understanding the practical importance of differences between groups. Journal of Applied Psychology, 96(5), 966–980. https://doi.org/10.1037/a0022955

Osberg, T. M., Atkins, L., Buchholz, L., Shirshova, V., Swiantek, A., Whitley, J., Hartman, S., & Oquendo, N. (2010). Development and validation of the College Life Alcohol Salience Scale: A measure of beliefs about the role of alcohol in college life. Psychology of Addictive Behaviors, 24(1), 1–12. https://doi.org/10.1037/a0018197

Zhang, G. (2022). A systematic review of measurement invariance research of the CES-D scale across gender [Unpublished master’s thesis, University of Southern California]. University of Southern California. https://doi.org/10.25549/usctheses-oUC111375873