Scaling and Standard Errors in SEM

In this post, I demonstrate why rescaling a coefficient (i.e., multiplied/divided by a constant) is different from “standardizing” a coefficient (i.e., multiplied/divided by the sample standard deviation, which is a random variable) in SEM.

See also this discussion on StackExchange and the article When constraints interact: A caution about reference variables, identification constraints, and scale dependencies in structural equation modeling.

Load Packages

library(lavaan)
## This is lavaan 0.6-12
## lavaan is FREE software! Please report any bugs.
# library(performance)
# library(parameters)
library(modelsummary)

Example

Different Scaling Choices

In the following, I manipulate the scaling constraints on (1) the average loading, (2) the first loading, (3) the second loading, and (4) the latent variance, so that they all have the same sample unit for the latent variable.

1. Effect coding (average loading = 1)

model <- '
  # latent variable definitions
    ind60 =~ x1 + x2 + x3
    dem60 =~ y1 + y2 + y3 + y4
  # regression
    dem60 ~ ind60
'
fit <- sem(model, data = PoliticalDemocracy, effect.coding = "loadings")
## Warning in lav_partable_add_bounds(partable = lavpartable, lavpta = lavpta, :
## lavaan WARNING: automatic bounds not available (yet) if effect.coding is used
## Warning in lav_partable_add_bounds(partable = lavpartable, lavh1 = lavh1, :
## lavaan WARNING: automatic bounds not available (yet) if effect.coding is used
parameterEstimates(fit)
##      lhs op   rhs   est    se      z pvalue ci.lower ci.upper
## 1  ind60 =~    x1 0.600 0.030 19.833  0.000    0.541    0.660
## 2  ind60 =~    x2 1.308 0.044 29.622  0.000    1.222    1.395
## 3  ind60 =~    x3 1.091 0.050 21.634  0.000    0.993    1.190
## 4  dem60 =~    y1 0.809 0.072 11.309  0.000    0.669    0.950
## 5  dem60 =~    y2 1.148 0.099 11.575  0.000    0.954    1.343
## 6  dem60 =~    y3 0.887 0.094  9.444  0.000    0.703    1.071
## 7  dem60 =~    y4 1.156 0.080 14.400  0.000    0.998    1.313
## 8  dem60  ~ ind60 1.067 0.263  4.056  0.000    0.552    1.583
## 9     x1 ~~    x1 0.081 0.020  4.123  0.000    0.043    0.120
## 10    x2 ~~    x2 0.120 0.072  1.676  0.094   -0.020    0.261
## 11    x3 ~~    x3 0.467 0.091  5.153  0.000    0.289    0.644
## 12    y1 ~~    y1 2.404 0.516  4.663  0.000    1.394    3.415
## 13    y2 ~~    y2 6.552 1.289  5.082  0.000    4.025    9.078
## 14    y3 ~~    y3 5.363 0.996  5.384  0.000    3.410    7.315
## 15    y4 ~~    y4 2.137 0.719  2.973  0.003    0.728    3.546
## 16 ind60 ~~ ind60 1.245 0.215  5.778  0.000    0.823    1.667
## 17 dem60 ~~ dem60 5.271 1.031  5.111  0.000    3.249    7.292

2. Fix first loading

# Get estimates from first model
coef(fit)[c("ind60=~x1", "dem60=~y1")]
## ind60=~x1 dem60=~y1 
## 0.6002121 0.8094457
model <- '
  # latent variable definitions
    ind60 =~ 0.6002121 * x1 + x2 + x3
    dem60 =~ 0.8094457 * y1 + y2 + y3 + y4
  # regression
    dem60 ~ ind60
'
fit1 <- sem(model, data = PoliticalDemocracy)
parameterEstimates(fit1)
##      lhs op   rhs   est    se      z pvalue ci.lower ci.upper
## 1  ind60 =~    x1 0.600 0.000     NA     NA    0.600    0.600
## 2  ind60 =~    x2 1.308 0.084 15.633  0.000    1.144    1.472
## 3  ind60 =~    x3 1.091 0.091 11.966  0.000    0.913    1.270
## 4  dem60 =~    y1 0.809 0.000     NA     NA    0.809    0.809
## 5  dem60 =~    y2 1.148 0.164  7.005  0.000    0.827    1.470
## 6  dem60 =~    y3 0.887 0.139  6.396  0.000    0.615    1.158
## 7  dem60 =~    y4 1.156 0.139  8.341  0.000    0.884    1.427
## 8  dem60  ~ ind60 1.067 0.281  3.801  0.000    0.517    1.618
## 9     x1 ~~    x1 0.081 0.020  4.123  0.000    0.043    0.120
## 10    x2 ~~    x2 0.120 0.072  1.676  0.094   -0.020    0.261
## 11    x3 ~~    x3 0.467 0.091  5.153  0.000    0.289    0.644
## 12    y1 ~~    y1 2.404 0.516  4.663  0.000    1.394    3.415
## 13    y2 ~~    y2 6.552 1.289  5.082  0.000    4.025    9.078
## 14    y3 ~~    y3 5.363 0.996  5.384  0.000    3.410    7.315
## 15    y4 ~~    y4 2.137 0.719  2.973  0.003    0.728    3.546
## 16 ind60 ~~ ind60 1.245 0.241  5.170  0.000    0.773    1.717
## 17 dem60 ~~ dem60 5.271 1.335  3.947  0.000    2.653    7.888

3. Fix second loading

# Get estimates from first model
coef(fit)[c("ind60=~x2", "dem60=~y2")]
## ind60=~x2 dem60=~y2 
##  1.308404  1.148334
model2 <- '
  # latent variable definitions
    ind60 =~ NA * x1 + 1.308404 * x2 + x3
    dem60 =~ NA * y1 + 1.148334 * y2 + y3 + y4
  # regression
    dem60 ~ ind60
'
fit2 <- sem(model2, data = PoliticalDemocracy)
parameterEstimates(fit2)  # SE changed!
##      lhs op   rhs   est    se      z pvalue ci.lower ci.upper
## 1  ind60 =~    x1 0.600 0.038 15.633  0.000    0.525    0.675
## 2  ind60 =~    x2 1.308 0.000     NA     NA    1.308    1.308
## 3  ind60 =~    x3 1.091 0.083 13.149  0.000    0.929    1.254
## 4  dem60 =~    y1 0.809 0.116  7.005  0.000    0.583    1.036
## 5  dem60 =~    y2 1.148 0.000     NA     NA    1.148    1.148
## 6  dem60 =~    y3 0.887 0.146  6.059  0.000    0.600    1.173
## 7  dem60 =~    y4 1.156 0.151  7.674  0.000    0.860    1.451
## 8  dem60  ~ ind60 1.067 0.284  3.761  0.000    0.511    1.623
## 9     x1 ~~    x1 0.081 0.020  4.123  0.000    0.043    0.120
## 10    x2 ~~    x2 0.120 0.072  1.676  0.094   -0.020    0.261
## 11    x3 ~~    x3 0.467 0.091  5.153  0.000    0.289    0.644
## 12    y1 ~~    y1 2.404 0.516  4.663  0.000    1.394    3.415
## 13    y2 ~~    y2 6.552 1.289  5.082  0.000    4.025    9.078
## 14    y3 ~~    y3 5.363 0.996  5.384  0.000    3.410    7.315
## 15    y4 ~~    y4 2.137 0.719  2.973  0.003    0.728    3.546
## 16 ind60 ~~ ind60 1.245 0.218  5.705  0.000    0.817    1.673
## 17 dem60 ~~ dem60 5.271 1.447  3.642  0.000    2.434    8.107

4. Fix latent variances

# Get estimates from first model
coef(fit)[c("ind60~~ind60", "dem60~~dem60")]
## ind60~~ind60 dem60~~dem60 
##     1.245068     5.270542
model3 <- '
  # latent variable definitions
    ind60 =~ NA * x1 + x2 + x3
    dem60 =~ NA * y1 + y2 + y3 + y4
  # latent variances
    ind60 ~~ 1.245068 * ind60
    dem60 ~~ 5.270542 * dem60
  # regression
    dem60 ~ ind60
'
fit3 <- sem(model3, data = PoliticalDemocracy)
parameterEstimates(fit3)  # SE changed again!
##      lhs op   rhs   est    se      z pvalue ci.lower ci.upper
## 1  ind60 =~    x1 0.600 0.058 10.340  0.000    0.486    0.714
## 2  ind60 =~    x2 1.308 0.115 11.410  0.000    1.084    1.533
## 3  ind60 =~    x3 1.091 0.115  9.477  0.000    0.866    1.317
## 4  dem60 =~    y1 0.809 0.103  7.893  0.000    0.608    1.010
## 5  dem60 =~    y2 1.148 0.158  7.285  0.000    0.839    1.457
## 6  dem60 =~    y3 0.887 0.134  6.606  0.000    0.624    1.150
## 7  dem60 =~    y4 1.156 0.126  9.169  0.000    0.909    1.403
## 8  ind60 ~~ ind60 1.245 0.000     NA     NA    1.245    1.245
## 9  dem60 ~~ dem60 5.271 0.000     NA     NA    5.271    5.271
## 10 dem60  ~ ind60 1.067 0.295  3.619  0.000    0.489    1.645
## 11    x1 ~~    x1 0.081 0.020  4.123  0.000    0.043    0.120
## 12    x2 ~~    x2 0.120 0.072  1.676  0.094   -0.020    0.261
## 13    x3 ~~    x3 0.467 0.091  5.153  0.000    0.289    0.644
## 14    y1 ~~    y1 2.404 0.516  4.663  0.000    1.394    3.415
## 15    y2 ~~    y2 6.552 1.289  5.082  0.000    4.025    9.078
## 16    y3 ~~    y3 5.363 0.996  5.384  0.000    3.410    7.315
## 17    y4 ~~    y4 2.137 0.719  2.973  0.003    0.728    3.546
msummary(list(`Effect coding` = fit,
              `Fix item 1` = fit1,
              `Fix item 2` = fit2,
              `Fix variance` = fit3))
Effect coding Fix item 1 Fix item 2 Fix variance
ind60 =~ x1 0.600 0.600 0.600 0.600
(0.030) (0.000) (0.038) (0.058)
ind60 =~ x2 1.308 1.308 1.308 1.308
(0.044) (0.084) (0.000) (0.115)
ind60 =~ x3 1.091 1.091 1.091 1.091
(0.050) (0.091) (0.083) (0.115)
dem60 =~ y1 0.809 0.809 0.809 0.809
(0.072) (0.000) (0.116) (0.103)
dem60 =~ y2 1.148 1.148 1.148 1.148
(0.099) (0.164) (0.000) (0.158)
dem60 =~ y3 0.887 0.887 0.887 0.887
(0.094) (0.139) (0.146) (0.134)
dem60 =~ y4 1.156 1.156 1.156 1.156
(0.080) (0.139) (0.151) (0.126)
dem60 ~ ind60 1.067 1.067 1.067 1.067
(0.263) (0.281) (0.284) (0.295)
Num.Obs. 75 75 75 75
AIC 1906.2 1906.2 1906.2 1906.2
BIC 1940.9 1940.9 1940.9 1940.9

Note that the parameter estimates are all the same, but the standard errors are all different. This has implications for statistical power when doing latent variable analysis.

Standardized Coefficients

Delta method

Directly in lavaan

msummary(list(`Fix item 1` = fit,
              `Fix item 2` = fit2,
              `Fix variance` = fit3),
         standardize = TRUE)
Fix item 1 Fix item 2 Fix variance
ind60 =~ x1 0.920 0.920 0.920
(0.023) (0.023) (0.023)
ind60 =~ x2 0.973 0.973 0.973
(0.017) (0.017) (0.017)
ind60 =~ x3 0.872 0.872 0.872
(0.031) (0.031) (0.031)
dem60 =~ y1 0.804 0.804 0.804
(0.051) (0.051) (0.051)
dem60 =~ y2 0.757 0.757 0.757
(0.058) (0.058) (0.058)
dem60 =~ y3 0.704 0.704 0.704
(0.066) (0.066) (0.066)
dem60 =~ y4 0.898 0.898 0.898
(0.039) (0.039) (0.039)
dem60 ~ ind60 0.460 0.460 0.460
(0.100) (0.100) (0.100)
Num.Obs. 75 75 75
AIC 1906.2 1906.2 1906.2
BIC 1940.9 1940.9 1940.9

1. Define new parameters in the model

model_std <- '
  # latent variable definitions
    ind60 =~ x1 + x2 + x3
    dem60 =~ y1 + y2 + y3 + y4
  # latent variances
    ind60 ~~ v1 * ind60
    dem60 ~~ ev2 * dem60
  # regression
    dem60 ~ b * ind60
  # define standardized coefficients
    v2 := b^2 * v1 + ev2
    beta := b * sqrt(v1 / v2)
'
fit_std <- sem(model_std, data = PoliticalDemocracy)
parameterEstimates(fit_std)  # same standardized coefficient
##      lhs op           rhs label   est    se      z pvalue ci.lower ci.upper
## 1  ind60 =~            x1       1.000 0.000     NA     NA    1.000    1.000
## 2  ind60 =~            x2       2.180 0.139 15.633  0.000    1.907    2.453
## 3  ind60 =~            x3       1.818 0.152 11.966  0.000    1.520    2.116
## 4  dem60 =~            y1       1.000 0.000     NA     NA    1.000    1.000
## 5  dem60 =~            y2       1.419 0.203  7.005  0.000    1.022    1.816
## 6  dem60 =~            y3       1.095 0.171  6.396  0.000    0.760    1.431
## 7  dem60 =~            y4       1.428 0.171  8.341  0.000    1.092    1.763
## 8  ind60 ~~         ind60    v1 0.449 0.087  5.170  0.000    0.279    0.619
## 9  dem60 ~~         dem60   ev2 3.453 0.875  3.947  0.000    1.738    5.168
## 10 dem60  ~         ind60     b 1.439 0.379  3.801  0.000    0.697    2.181
## 11    x1 ~~            x1       0.081 0.020  4.123  0.000    0.043    0.120
## 12    x2 ~~            x2       0.120 0.072  1.676  0.094   -0.020    0.261
## 13    x3 ~~            x3       0.467 0.091  5.153  0.000    0.289    0.644
## 14    y1 ~~            y1       2.404 0.516  4.663  0.000    1.394    3.415
## 15    y2 ~~            y2       6.552 1.289  5.082  0.000    4.025    9.078
## 16    y3 ~~            y3       5.363 0.996  5.384  0.000    3.410    7.315
## 17    y4 ~~            y4       2.137 0.719  2.973  0.003    0.728    3.546
## 18    v2 :=    b^2*v1+ev2    v2 4.382 1.089  4.024  0.000    2.248    6.517
## 19  beta := b*sqrt(v1/v2)  beta 0.460 0.100  4.593  0.000    0.264    0.657

2. Use constraints to fix variances

model_con <- '
  # latent variable definitions
    ind60 =~ NA * x1 + x2 + x3
    dem60 =~ NA * y1 + y2 + y3 + y4
  # latent variances
    ind60 ~~ 1 * ind60
    dem60 ~~ ev2 * dem60
  # regression
    dem60 ~ beta * ind60
  # constraints
    ev2 == 1 - beta^2
'
fit_con <- sem(model_con, data = PoliticalDemocracy)
parameterEstimates(fit_con)  # same standardized coefficient
##      lhs op   rhs label   est    se      z pvalue ci.lower ci.upper
## 1  ind60 =~    x1       0.670 0.065 10.340  0.000    0.543    0.797
## 2  ind60 =~    x2       1.460 0.128 11.410  0.000    1.209    1.711
## 3  ind60 =~    x3       1.218 0.129  9.477  0.000    0.966    1.470
## 4  dem60 =~    y1       2.093 0.260  8.049  0.000    1.584    2.603
## 5  dem60 =~    y2       2.970 0.401  7.406  0.000    2.184    3.756
## 6  dem60 =~    y3       2.293 0.342  6.696  0.000    1.622    2.964
## 7  dem60 =~    y4       2.989 0.315  9.494  0.000    2.372    3.606
## 8  ind60 ~~ ind60       1.000 0.000     NA     NA    1.000    1.000
## 9  dem60 ~~ dem60   ev2 0.788 0.092  8.535  0.000    0.607    0.969
## 10 dem60  ~ ind60  beta 0.460 0.100  4.593  0.000    0.264    0.657
## 11    x1 ~~    x1       0.081 0.020  4.123  0.000    0.043    0.120
## 12    x2 ~~    x2       0.120 0.072  1.676  0.094   -0.020    0.261
## 13    x3 ~~    x3       0.467 0.091  5.153  0.000    0.289    0.644
## 14    y1 ~~    y1       2.404 0.516  4.663  0.000    1.394    3.415
## 15    y2 ~~    y2       6.552 1.289  5.082  0.000    4.025    9.078
## 16    y3 ~~    y3       5.363 0.996  5.384  0.000    3.410    7.315
## 17    y4 ~~    y4       2.137 0.719  2.973  0.003    0.729    3.546

3. (Incorrect) constraining the first indicators

coef(fit_con)[c("ind60=~x1", "dem60=~y1")]
## ind60=~x1 dem60=~y1 
## 0.6697334 2.0934489
model_conl <- '
  # latent variable definitions
    ind60 =~ 0.6697334 * x1 + x2 + x3
    dem60 =~ 2.0934489 * y1 + y2 + y3 + y4
  # regression
    dem60 ~ beta * ind60
'
fit_conl <- sem(model_conl, data = PoliticalDemocracy)
parameterEstimates(fit_conl)  # same standardized coefficient
##      lhs op   rhs label   est    se      z pvalue ci.lower ci.upper
## 1  ind60 =~    x1       0.670 0.000     NA     NA    0.670    0.670
## 2  ind60 =~    x2       1.460 0.093 15.633  0.000    1.277    1.643
## 3  ind60 =~    x3       1.218 0.102 11.966  0.000    1.018    1.417
## 4  dem60 =~    y1       2.093 0.000     NA     NA    2.093    2.093
## 5  dem60 =~    y2       2.970 0.424  7.005  0.000    2.139    3.801
## 6  dem60 =~    y3       2.293 0.359  6.396  0.000    1.590    2.996
## 7  dem60 =~    y4       2.989 0.358  8.341  0.000    2.286    3.691
## 8  dem60  ~ ind60  beta 0.460 0.121  3.801  0.000    0.223    0.698
## 9     x1 ~~    x1       0.081 0.020  4.123  0.000    0.043    0.120
## 10    x2 ~~    x2       0.120 0.072  1.676  0.094   -0.020    0.261
## 11    x3 ~~    x3       0.467 0.091  5.153  0.000    0.289    0.644
## 12    y1 ~~    y1       2.404 0.516  4.663  0.000    1.394    3.415
## 13    y2 ~~    y2       6.552 1.289  5.082  0.000    4.025    9.078
## 14    y3 ~~    y3       5.363 0.996  5.384  0.000    3.410    7.315
## 15    y4 ~~    y4       2.137 0.719  2.973  0.003    0.728    3.546
## 16 ind60 ~~ ind60       1.000 0.193  5.170  0.000    0.621    1.379
## 17 dem60 ~~ dem60       0.788 0.200  3.947  0.000    0.397    1.179
msummary(list(`Define new par` = fit_std,
              `Constraints` = fit_con,
              `Fix loading` = fit_conl))
Define new par Constraints Fix loading
ind60 =~ x1 1.000 0.670 0.670
(0.000) (0.065) (0.000)
ind60 =~ x2 2.180 1.460 1.460
(0.139) (0.128) (0.093)
ind60 =~ x3 1.818 1.218 1.218
(0.152) (0.129) (0.102)
dem60 =~ y1 1.000 2.093 2.093
(0.000) (0.260) (0.000)
dem60 =~ y2 1.419 2.970 2.970
(0.203) (0.401) (0.424)
dem60 =~ y3 1.095 2.293 2.293
(0.171) (0.342) (0.359)
dem60 =~ y4 1.428 2.989 2.989
(0.171) (0.315) (0.358)
dem60 ~ ind60 1.439 0.460 0.460
(0.379) (0.100) (0.121)
v2 × = b^2*v1+ev2 4.382
(1.089)
beta × = b*sqrt(v1/v2) 0.460
(0.100)
Num.Obs. 75 75 75
AIC 1906.2 1906.2 1906.2
BIC 1940.9 1940.9 1940.9

As can be shown, while the first two methods give the same standardized beta in terms of estimates and standard errors, the third does not as it does not account for the uncertainty in the standard deviation.

Hok Chio (Mark) Lai 黎學昭
Hok Chio (Mark) Lai 黎學昭
Assistant Professor of Psychology (Quantitative Methods)

My research interests include statistics, multilevel and latent variable models, and psychometrics.

comments powered by Disqus

Related