Obtaining reliability for daily diary data using multilevel factor analysis

Hok Chio (Mark) Lai, Feng Ji, Shi Chen

University of Southern California, University of California, Berkeley, Northern Arizona University

2021 IMPS

Daily Diary Data (Positive Affect)

Multiple Items

Composite/Scale Scores

Person Mean And Deviation

Reliability is Not Commonly Reported for Diary Data

PsycInfo ("daily diary" and "emotion", peer-reviewed, 2020 July 1 to December 31)

15 articles; 14 with diary measures; 11 with multi-item measures
- Within-person/change reliability: 4
- Single reliability coefficient: 3
- None reported: 4

Approaches for level-specific reliability

Generalizability theory (GT; Cranford et al., 2006; Shrout et al., 2012)
Multilevel factor analysis (MFA; Geldhof et al., 2014; Lai, 2021)

Overview

- GT as a special case of MFA
- Reliability of person means (with sampling error)
- Reliability of within-person deviations/Reliability of change
- Do we have enough items?

MFA

"Unconstrained" Multilevel Factor Model

$i$ indexes person; $t$ indexes time

$Y_{t i} = ν + \underbrace{λ^{b} η_{i}^{b} + ϵ_{i}^{b}}_{\text{between model}} + \underbrace{λ_{i}^{w} η_{t i}^{w} + ϵ_{t i}^{w}}_{\text{within model}}$

GT as MFA

observations : (item × person)

Here I assume no day-specific variance

(Essential) Parallel

$λ_{j}^{b} = λ_{j}^{w} = 1$
Constant uniqueness: $V (ϵ_{i j}^{b}) = θ^{b}$ and $V (ϵ_{t i j}^{w}) = θ^{w}$

GT: $Y_{t i j} = (μ + I_{j}) + P_{i} + (P I)_{i j} + (T P)_{t i} + e_{t i j}$

MFA: $Y_{t i j} = ν_{j} + λ_{j}^{b} η_{i}^{b} + ϵ_{i j}^{b} + λ_{i j}^{w} η_{t i}^{w} + ϵ_{t i j}^{w}$
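
Matching the two equations term by term may make the correspondence easier to see. The sketch below assumes the essentially parallel constraints ($λ_{j}^{b} = λ_{j}^{w} = 1$ and constant uniqueness) and writes $ψ^{b}$ and $ψ^{w}$ for the between and within factor variances; $ψ^{b}$ is notation introduced here for illustration.

$$
\begin{aligned}
ν_{j} &= μ + I_{j} \\
η_{i}^{b} &= P_{i}, & V(P) &= ψ^{b} \\
ϵ_{i j}^{b} &= (P I)_{i j}, & V(P I) &= θ^{b} \\
η_{t i}^{w} &= (T P)_{t i}, & V(T P) &= ψ^{w} \\
ϵ_{t i j}^{w} &= e_{t i j}, & V(e) &= θ^{w}
\end{aligned}
$$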

Types of Observed Scores

Raw Composite: $Z_{t i} = \sum_{j = 1}^{p} Y_{t i j}$
Person Means: ${\bar{Z}}_{. i} = \sum_{t = 1}^{n} Z_{t i} / n$
Person deviation: $Z_{t i} - {\bar{Z}}_{. i}$

$n$ = number of time points
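
As a minimal sketch of the three score types, the base R code below computes them on a hypothetical long-format data frame. The data frame `dat`, the `id` column, the item names `y1` to `y4`, and all values are made up for illustration and are not from the MIDUS data.

```r
# Hypothetical toy data: 3 persons x 2 days, 4 items (values are made up)
dat <- data.frame(
  id = rep(1:3, each = 2),
  y1 = c(1, 2, 3, 3, 0, 1),
  y2 = c(2, 2, 4, 3, 1, 1),
  y3 = c(1, 3, 3, 4, 0, 2),
  y4 = c(2, 1, 4, 4, 1, 0)
)
items <- c("y1", "y2", "y3", "y4")

# Raw composite Z_ti: sum of the p items on each person-day
dat$z <- rowSums(dat[items])

# Person mean: average of Z_ti over that person's n time points
dat$z_pm <- ave(dat$z, dat$id, FUN = mean)

# Person deviation: raw composite minus the person mean
dat$z_dev <- dat$z - dat$z_pm

dat
```

By construction, the deviations sum to zero within each person, so they carry only within-person (state) information.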

Reliability of Person Means (Traits)

$Σ^{w} = \{σ_{j j^{'}}^{w}\}$: within covariance matrix

$Σ^{b} = \{σ_{j j^{'}}^{b}\}$: between covariance matrix

Lai (2021):

$α^{b} = \frac{p}{p - 1} \left(\frac{\sum_{j \neq j^{'}} σ_{j j^{'}}^{b}}{1^{'} Σ^{b} 1 + \underbrace{1^{'} Σ^{w} 1 / \tilde{n}}_{\text{sampling error}}}\right)$

Sample person mean of $n$ time points is not the same as the true person mean

Between reliability by Geldhof et al. (2014) ignores this sampling error
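
To illustrate how the sampling-error term lowers the reliability of person means, here is a minimal sketch of the $α^{b}$ formula in R. The function name `alpha_between()` and the covariance values are hypothetical, and $\tilde{n}$ is simply treated as the number of time points per person, which is an assumption since the slide does not define it.

```r
# Between-level alpha for person means, following the formula above.
# sigma_b, sigma_w: p x p between and within covariance matrices
# n_tilde: number of time points per person (assumed meaning of n-tilde)
alpha_between <- function(sigma_b, sigma_w, n_tilde) {
  p <- nrow(sigma_b)
  off_diag_b <- sum(sigma_b) - sum(diag(sigma_b))  # sum over j != j'
  denom <- sum(sigma_b) + sum(sigma_w) / n_tilde   # 1'Sigma_b 1 + 1'Sigma_w 1 / n
  p / (p - 1) * off_diag_b / denom
}

# Hypothetical covariance matrices (made-up values, not MIDUS estimates)
p <- 4
sigma_b <- matrix(0.5, p, p); diag(sigma_b) <- 0.6
sigma_w <- matrix(0.2, p, p); diag(sigma_w) <- 0.5

a_8   <- alpha_between(sigma_b, sigma_w, n_tilde = 8)    # 8 days per person
a_inf <- alpha_between(sigma_b, sigma_w, n_tilde = Inf)  # no sampling error
c(n8 = a_8, ninf = a_inf)
```

With `n_tilde = Inf` the sampling-error term vanishes, so the second value is always at least as large as the first.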

Reliability of Within-Person Deviations (States)

Same as reliability of change/fluctuations

Lai (2021):

$α^{w} = \frac{p}{p - 1} \left(\frac{\sum_{j \neq j^{'}} σ_{j j^{'}}^{w}}{1^{'} Σ^{w} 1}\right)$

Between and within $ω$ reliability can be obtained by allowing different loadings across items
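
The within-level $α^{w}$ formula can be sketched the same way. The function name `alpha_within()` and the covariance values below are hypothetical.

```r
# Within-level alpha for person deviations, following the formula above.
# sigma_w: p x p within-person covariance matrix of the items
alpha_within <- function(sigma_w) {
  p <- nrow(sigma_w)
  off_diag_w <- sum(sigma_w) - sum(diag(sigma_w))  # sum of off-diagonal covariances
  p / (p - 1) * off_diag_w / sum(sigma_w)          # denominator is 1'Sigma_w 1
}

# Hypothetical within covariance matrix (made-up values, not MIDUS estimates)
sigma_w <- matrix(0.2, 4, 4); diag(sigma_w) <- 0.5
a_w <- alpha_within(sigma_w)
a_w
```

Unlike the between-level coefficient, this one involves only $Σ^{w}$, so it does not depend on the number of time points.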

Example: Midlife in the United States

Data from MIDUS 2: Daily Stress Project, 2004-2009 (Ryff et al., 2009)

2,022 participants, 8 days each

Target construct: Positive affect

Item	Wording
b2dc24	Did you feel attentive?
b2dc25	Did you feel proud?
b2dc26	Did you feel active?
b2dc27	Did you feel confident?

Est $ICC (η) = .778$

Composite	Est $α$	95% CI	Est $ω$	95% CI
Raw	.832	[.820, .843]	.829	[.817, .841]
Within	.646	[.628, .664]	.645	[.625, .662]
Between	.862	[.849, .873]	.860	[.846, .872]

Equivalence of GT and Constrained MFA

GT:

Reliability of change (Cranford et al., 2006): $V (T P) / [V (T P) + V (e) / p]$

#>    Rc 
#> 0.646

Constrained MFA:

$ρ^{w}$ (Geldhof et al., 2014; Lai, 2021): $p^{2} ψ^{w} / (p^{2} ψ^{w} + p θ^{w})$

#> rho^w 
#> 0.646

$p$ = number of items; $ψ^{w}$ = within-level factor variance; $θ^{w}$ = within-level unique variance
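
The algebraic equivalence of the two expressions can be checked numerically. This sketch uses hypothetical variance values and treats the GT within-person (state) variance component and error variance as equal to $ψ^{w}$ and $θ^{w}$ in the constrained MFA.

```r
# Hypothetical variance components (made-up values, not MIDUS estimates)
p       <- 4    # number of items
psi_w   <- 0.3  # within-level factor variance (GT: within-person state variance)
theta_w <- 0.6  # within-level unique variance (GT: error variance)

# GT form: state variance over state variance plus error variance / p
rc_gt <- psi_w / (psi_w + theta_w / p)

# Constrained MFA form: p^2 psi_w / (p^2 psi_w + p theta_w)
rho_w <- p^2 * psi_w / (p^2 * psi_w + p * theta_w)

all.equal(rc_gt, rho_w)
```

Dividing the numerator and denominator of the MFA expression by $p^{2}$ gives the GT expression, which is why the two estimates above agree.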

R Function `multilevel_alpha()`

https://github.com/marklhc/mcfa_reliability_supp/blob/master/multilevel_alpha.R

multilevel_alpha(d2_var[c("b2dc24", "b2dc25", "b2dc26", "b2dc27")], 
                 id = d2_var$m2id)

#> Parallel analysis suggests that the number of factors =  NA  and the number of components =  1 
#> Parallel analysis suggests that the number of factors =  NA  and the number of components =  1

#> $alpha
#>   alpha2l    alphab    alphaw 
#> 0.8318040 0.8616034 0.6460802 
#> 
#> $alpha_ci
#>              2.5%     97.5%
#> alpha2l 0.8202014 0.8425253
#> alphab  0.8488781 0.8734602
#> alphaw  0.6269076 0.6638443
#> 
#> $omega
#>   omega2l    omegab    omegaw 
#> 0.8293460 0.8595804 0.6445232 
#> 
#> $omega_ci
#>              2.5%     97.5%
#> omega2l 0.8173008 0.8408357
#> omegab  0.8461517 0.8719812
#> omegaw  0.6259997 0.6620339
#> 
#> $ncomp
#>  within between 
#>       1       1

Do We Have Enough Items to Capture Change?

With 4 items, within-person reliability is only .646

Spearman-Brown formula:

Need 6 items for $α^{w} > .70$ , 9 items for $α^{w} > .80$
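
The item counts above can be reproduced with a short Spearman-Brown calculation. This is a sketch with hypothetical helper names (`spearman_brown()`, `items_needed()`), and it rests on the usual Spearman-Brown assumption that added items are parallel to the existing ones.

```r
# Spearman-Brown: projected reliability after changing test length by factor k
spearman_brown <- function(rel, k) k * rel / (1 + (k - 1) * rel)

# Smallest number of items whose projected reliability reaches a target
items_needed <- function(rel, p, target) {
  k <- target * (1 - rel) / (rel * (1 - target))  # required lengthening factor
  ceiling(k * p)
}

alpha_w <- 0.646  # within-person alpha with p = 4 items (from the example)

n70 <- items_needed(alpha_w, p = 4, target = 0.70)  # 6
n80 <- items_needed(alpha_w, p = 4, target = 0.80)  # 9
c(n70 = n70, n80 = n80)

# Projected reliability with 5 vs. 6 items: below and above .70, respectively
spearman_brown(alpha_w, k = c(5, 6) / 4)
```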

Conclusion

Reliability information needs to be more consistently reported for diary studies
- And tools are needed to make the computation more accessible
Using one or two items may not allow reliable examination of change
- Especially when the ICC is high
- Choosing items with higher loadings may help
- More scale validation in daily diary context helps researchers plan for sufficient reliability

References

Cranford, J. A. et al. (2006). "A procedure for evaluating sensitivity to within-person change: Can mood measures in diary studies detect change reliably?" In: Personality and Social Psychology Bulletin 32.7, pp. 917-929. DOI: 10.1177/0146167206287721.

Geldhof, G. J. et al. (2014). "Reliability estimation in a multilevel confirmatory factor analysis framework". In: Psychological Methods 19.1, pp. 72-91. DOI: 10.1037/a0032138.

Lai, M. H. C. (2021). "Composite reliability of multilevel data: It’s about observed scores and construct meanings". In: Psychological Methods 26.1. DOI: 10.1037/met0000287.

Shrout, P. E. et al. (2012). "Psychometrics". In: Handbook of Research Methods for Studying Daily Life. New York, NY, US: The Guilford Press, pp. 302-320. ISBN: 978-1-60918-747-7 978-1-60918-749-1.

Thanks!

Slides created via the R package xaringan.

The chakra comes from remark.js, knitr, and R Markdown.
