Scaling and Standard Errors in SEM

Statistics
Author

Mark Lai

Published

June 17, 2022

Modified

August 20, 2022

In this post, I demonstrate why rescaling a coefficient (i.e., multiplied/divided by a constant) is different from “standardizing” a coefficient (i.e., multiplied/divided by the sample standard deviation, which is a random variable) in SEM.

See also this discussion on StackExchange and the article When constraints interact: A caution about reference variables, identification constraints, and scale dependencies in structural equation modeling.

Load Packages

library(lavaan)
This is lavaan 0.6-17
lavaan is FREE software! Please report any bugs.
# library(performance)
# library(parameters)
library(modelsummary)

Example

Different Scaling Choices

In the following, I manipulate the scaling constraints on (1) the average loading, (2) the first loading, (3) the second loading, and (4) the latent variance, so that they all have the same sample unit for the latent variable.

1. Effect coding (average loading = 1)

model <- '
  # latent variable definitions
    ind60 =~ x1 + x2 + x3
    dem60 =~ y1 + y2 + y3 + y4
  # regression
    dem60 ~ ind60
'
fit <- sem(model, data = PoliticalDemocracy, effect.coding = "loadings")
parameterEstimates(fit)
     lhs op   rhs   est    se      z pvalue ci.lower ci.upper
1  ind60 =~    x1 0.600 0.030 19.833  0.000    0.541    0.660
2  ind60 =~    x2 1.308 0.044 29.622  0.000    1.222    1.395
3  ind60 =~    x3 1.091 0.050 21.634  0.000    0.993    1.190
4  dem60 =~    y1 0.809 0.072 11.309  0.000    0.669    0.950
5  dem60 =~    y2 1.148 0.099 11.575  0.000    0.954    1.343
6  dem60 =~    y3 0.887 0.094  9.444  0.000    0.703    1.071
7  dem60 =~    y4 1.156 0.080 14.400  0.000    0.998    1.313
8  dem60  ~ ind60 1.067 0.263  4.056  0.000    0.552    1.583
9     x1 ~~    x1 0.081 0.020  4.123  0.000    0.043    0.120
10    x2 ~~    x2 0.120 0.072  1.676  0.094   -0.020    0.261
11    x3 ~~    x3 0.467 0.091  5.153  0.000    0.289    0.644
12    y1 ~~    y1 2.404 0.516  4.663  0.000    1.394    3.415
13    y2 ~~    y2 6.552 1.289  5.082  0.000    4.025    9.078
14    y3 ~~    y3 5.363 0.996  5.384  0.000    3.410    7.315
15    y4 ~~    y4 2.137 0.719  2.973  0.003    0.728    3.546
16 ind60 ~~ ind60 1.245 0.215  5.778  0.000    0.823    1.667
17 dem60 ~~ dem60 5.271 1.031  5.111  0.000    3.249    7.292

2. Fix first loading

# Get estimates from first model
coef(fit)[c("ind60=~x1", "dem60=~y1")]
ind60=~x1 dem60=~y1 
0.6002121 0.8094457 
model <- '
  # latent variable definitions
    ind60 =~ 0.6002121 * x1 + x2 + x3
    dem60 =~ 0.8094457 * y1 + y2 + y3 + y4
  # regression
    dem60 ~ ind60
'
fit1 <- sem(model, data = PoliticalDemocracy)
parameterEstimates(fit1)
     lhs op   rhs   est    se      z pvalue ci.lower ci.upper
1  ind60 =~    x1 0.600 0.000     NA     NA    0.600    0.600
2  ind60 =~    x2 1.308 0.084 15.633  0.000    1.144    1.472
3  ind60 =~    x3 1.091 0.091 11.966  0.000    0.913    1.270
4  dem60 =~    y1 0.809 0.000     NA     NA    0.809    0.809
5  dem60 =~    y2 1.148 0.164  7.005  0.000    0.827    1.470
6  dem60 =~    y3 0.887 0.139  6.396  0.000    0.615    1.158
7  dem60 =~    y4 1.156 0.139  8.341  0.000    0.884    1.427
8  dem60  ~ ind60 1.067 0.281  3.801  0.000    0.517    1.618
9     x1 ~~    x1 0.081 0.020  4.123  0.000    0.043    0.120
10    x2 ~~    x2 0.120 0.072  1.676  0.094   -0.020    0.261
11    x3 ~~    x3 0.467 0.091  5.153  0.000    0.289    0.644
12    y1 ~~    y1 2.404 0.516  4.663  0.000    1.394    3.415
13    y2 ~~    y2 6.552 1.289  5.082  0.000    4.025    9.078
14    y3 ~~    y3 5.363 0.996  5.384  0.000    3.410    7.315
15    y4 ~~    y4 2.137 0.719  2.973  0.003    0.728    3.546
16 ind60 ~~ ind60 1.245 0.241  5.170  0.000    0.773    1.717
17 dem60 ~~ dem60 5.271 1.335  3.947  0.000    2.653    7.888

3. Fix second loading

# Get estimates from first model
coef(fit)[c("ind60=~x2", "dem60=~y2")]
ind60=~x2 dem60=~y2 
 1.308404  1.148334 
model2 <- '
  # latent variable definitions
    ind60 =~ NA * x1 + 1.308404 * x2 + x3
    dem60 =~ NA * y1 + 1.148334 * y2 + y3 + y4
  # regression
    dem60 ~ ind60
'
fit2 <- sem(model2, data = PoliticalDemocracy)
parameterEstimates(fit2)  # SE changed!
     lhs op   rhs   est    se      z pvalue ci.lower ci.upper
1  ind60 =~    x1 0.600 0.038 15.633  0.000    0.525    0.675
2  ind60 =~    x2 1.308 0.000     NA     NA    1.308    1.308
3  ind60 =~    x3 1.091 0.083 13.149  0.000    0.929    1.254
4  dem60 =~    y1 0.809 0.116  7.005  0.000    0.583    1.036
5  dem60 =~    y2 1.148 0.000     NA     NA    1.148    1.148
6  dem60 =~    y3 0.887 0.146  6.059  0.000    0.600    1.173
7  dem60 =~    y4 1.156 0.151  7.674  0.000    0.860    1.451
8  dem60  ~ ind60 1.067 0.284  3.761  0.000    0.511    1.623
9     x1 ~~    x1 0.081 0.020  4.123  0.000    0.043    0.120
10    x2 ~~    x2 0.120 0.072  1.676  0.094   -0.020    0.261
11    x3 ~~    x3 0.467 0.091  5.153  0.000    0.289    0.644
12    y1 ~~    y1 2.404 0.516  4.663  0.000    1.394    3.415
13    y2 ~~    y2 6.552 1.289  5.082  0.000    4.025    9.078
14    y3 ~~    y3 5.363 0.996  5.384  0.000    3.410    7.315
15    y4 ~~    y4 2.137 0.719  2.973  0.003    0.728    3.546
16 ind60 ~~ ind60 1.245 0.218  5.705  0.000    0.817    1.673
17 dem60 ~~ dem60 5.271 1.447  3.642  0.000    2.434    8.107

4. Fix latent variances

# Get estimates from first model
coef(fit)[c("ind60~~ind60", "dem60~~dem60")]
ind60~~ind60 dem60~~dem60 
    1.245068     5.270542 
model3 <- '
  # latent variable definitions
    ind60 =~ NA * x1 + x2 + x3
    dem60 =~ NA * y1 + y2 + y3 + y4
  # latent variances
    ind60 ~~ 1.245068 * ind60
    dem60 ~~ 5.270542 * dem60
  # regression
    dem60 ~ ind60
'
fit3 <- sem(model3, data = PoliticalDemocracy)
parameterEstimates(fit3)  # SE changed again!
     lhs op   rhs   est    se      z pvalue ci.lower ci.upper
1  ind60 =~    x1 0.600 0.058 10.340  0.000    0.486    0.714
2  ind60 =~    x2 1.308 0.115 11.410  0.000    1.084    1.533
3  ind60 =~    x3 1.091 0.115  9.477  0.000    0.866    1.317
4  dem60 =~    y1 0.809 0.103  7.893  0.000    0.608    1.010
5  dem60 =~    y2 1.148 0.158  7.285  0.000    0.839    1.457
6  dem60 =~    y3 0.887 0.134  6.606  0.000    0.624    1.150
7  dem60 =~    y4 1.156 0.126  9.169  0.000    0.909    1.403
8  ind60 ~~ ind60 1.245 0.000     NA     NA    1.245    1.245
9  dem60 ~~ dem60 5.271 0.000     NA     NA    5.271    5.271
10 dem60  ~ ind60 1.067 0.295  3.619  0.000    0.489    1.645
11    x1 ~~    x1 0.081 0.020  4.123  0.000    0.043    0.120
12    x2 ~~    x2 0.120 0.072  1.676  0.094   -0.020    0.261
13    x3 ~~    x3 0.467 0.091  5.153  0.000    0.289    0.644
14    y1 ~~    y1 2.404 0.516  4.663  0.000    1.394    3.415
15    y2 ~~    y2 6.552 1.289  5.082  0.000    4.025    9.078
16    y3 ~~    y3 5.363 0.996  5.384  0.000    3.410    7.315
17    y4 ~~    y4 2.137 0.719  2.973  0.003    0.728    3.546
msummary(list(`Effect coding` = fit,
              `Fix item 1` = fit1,
              `Fix item 2` = fit2,
              `Fix variance` = fit3))
Effect coding Fix item 1  Fix item 2 Fix variance
ind60 =~ x1 0.600 0.600 0.600 0.600
(0.030) (0.000) (0.038) (0.058)
ind60 =~ x2 1.308 1.308 1.308 1.308
(0.044) (0.084) (0.000) (0.115)
ind60 =~ x3 1.091 1.091 1.091 1.091
(0.050) (0.091) (0.083) (0.115)
dem60 =~ y1 0.809 0.809 0.809 0.809
(0.072) (0.000) (0.116) (0.103)
dem60 =~ y2 1.148 1.148 1.148 1.148
(0.099) (0.164) (0.000) (0.158)
dem60 =~ y3 0.887 0.887 0.887 0.887
(0.094) (0.139) (0.146) (0.134)
dem60 =~ y4 1.156 1.156 1.156 1.156
(0.080) (0.139) (0.151) (0.126)
dem60 ~ ind60 1.067 1.067 1.067 1.067
(0.263) (0.281) (0.284) (0.295)
Num.Obs. 75 75 75 75
AIC 1906.2 1906.2 1906.2 1906.2
BIC 1940.9 1940.9 1940.9 1940.9

Note that the parameter estimates are all the same, but the standard errors are all different. This has implications for statistical power when doing latent variable analysis.

Standardized Coefficients

Delta method

Directly in lavaan

msummary(list(`Fix item 1` = fit,
              `Fix item 2` = fit2,
              `Fix variance` = fit3),
         standardize = TRUE)
Fix item 1  Fix item 2 Fix variance
ind60 =~ x1 0.920 0.920 0.920
(0.023) (0.023) (0.023)
ind60 =~ x2 0.973 0.973 0.973
(0.017) (0.017) (0.017)
ind60 =~ x3 0.872 0.872 0.872
(0.031) (0.031) (0.031)
dem60 =~ y1 0.804 0.804 0.804
(0.051) (0.051) (0.051)
dem60 =~ y2 0.757 0.757 0.757
(0.058) (0.058) (0.058)
dem60 =~ y3 0.704 0.704 0.704
(0.066) (0.066) (0.066)
dem60 =~ y4 0.898 0.898 0.898
(0.039) (0.039) (0.039)
dem60 ~ ind60 0.460 0.460 0.460
(0.100) (0.100) (0.100)
Num.Obs. 75 75 75
AIC 1906.2 1906.2 1906.2
BIC 1940.9 1940.9 1940.9

1. Define new parameters in the model

model_std <- '
  # latent variable definitions
    ind60 =~ x1 + x2 + x3
    dem60 =~ y1 + y2 + y3 + y4
  # latent variances
    ind60 ~~ v1 * ind60
    dem60 ~~ ev2 * dem60
  # regression
    dem60 ~ b * ind60
  # define standardized coefficients
    v2 := b^2 * v1 + ev2
    beta := b * sqrt(v1 / v2)
'
fit_std <- sem(model_std, data = PoliticalDemocracy)
parameterEstimates(fit_std)  # same standardized coefficient
     lhs op           rhs label   est    se      z pvalue ci.lower ci.upper
1  ind60 =~            x1       1.000 0.000     NA     NA    1.000    1.000
2  ind60 =~            x2       2.180 0.139 15.633  0.000    1.907    2.453
3  ind60 =~            x3       1.818 0.152 11.966  0.000    1.520    2.116
4  dem60 =~            y1       1.000 0.000     NA     NA    1.000    1.000
5  dem60 =~            y2       1.419 0.203  7.005  0.000    1.022    1.816
6  dem60 =~            y3       1.095 0.171  6.396  0.000    0.760    1.431
7  dem60 =~            y4       1.428 0.171  8.341  0.000    1.092    1.763
8  ind60 ~~         ind60    v1 0.449 0.087  5.170  0.000    0.279    0.619
9  dem60 ~~         dem60   ev2 3.453 0.875  3.947  0.000    1.738    5.168
10 dem60  ~         ind60     b 1.439 0.379  3.801  0.000    0.697    2.181
11    x1 ~~            x1       0.081 0.020  4.123  0.000    0.043    0.120
12    x2 ~~            x2       0.120 0.072  1.676  0.094   -0.020    0.261
13    x3 ~~            x3       0.467 0.091  5.153  0.000    0.289    0.644
14    y1 ~~            y1       2.404 0.516  4.663  0.000    1.394    3.415
15    y2 ~~            y2       6.552 1.289  5.082  0.000    4.025    9.078
16    y3 ~~            y3       5.363 0.996  5.384  0.000    3.410    7.315
17    y4 ~~            y4       2.137 0.719  2.973  0.003    0.728    3.546
18    v2 :=    b^2*v1+ev2    v2 4.382 1.089  4.024  0.000    2.248    6.517
19  beta := b*sqrt(v1/v2)  beta 0.460 0.100  4.593  0.000    0.264    0.657

2. Use constraints to fix variances

model_con <- '
  # latent variable definitions
    ind60 =~ NA * x1 + x2 + x3
    dem60 =~ NA * y1 + y2 + y3 + y4
  # latent variances
    ind60 ~~ 1 * ind60
    dem60 ~~ ev2 * dem60
  # regression
    dem60 ~ beta * ind60
  # constraints
    ev2 == 1 - beta^2
'
fit_con <- sem(model_con, data = PoliticalDemocracy)
parameterEstimates(fit_con)  # same standardized coefficient
     lhs op   rhs label   est    se      z pvalue ci.lower ci.upper
1  ind60 =~    x1       0.670 0.065 10.340  0.000    0.543    0.797
2  ind60 =~    x2       1.460 0.128 11.410  0.000    1.209    1.711
3  ind60 =~    x3       1.218 0.129  9.477  0.000    0.966    1.470
4  dem60 =~    y1       2.093 0.260  8.049  0.000    1.584    2.603
5  dem60 =~    y2       2.970 0.401  7.406  0.000    2.184    3.756
6  dem60 =~    y3       2.293 0.342  6.696  0.000    1.622    2.964
7  dem60 =~    y4       2.989 0.315  9.494  0.000    2.372    3.606
8  ind60 ~~ ind60       1.000 0.000     NA     NA    1.000    1.000
9  dem60 ~~ dem60   ev2 0.788 0.092  8.535  0.000    0.607    0.969
10 dem60  ~ ind60  beta 0.460 0.100  4.593  0.000    0.264    0.657
11    x1 ~~    x1       0.081 0.020  4.123  0.000    0.043    0.120
12    x2 ~~    x2       0.120 0.072  1.676  0.094   -0.020    0.261
13    x3 ~~    x3       0.467 0.091  5.153  0.000    0.289    0.644
14    y1 ~~    y1       2.404 0.516  4.663  0.000    1.394    3.415
15    y2 ~~    y2       6.552 1.289  5.082  0.000    4.025    9.078
16    y3 ~~    y3       5.363 0.996  5.384  0.000    3.410    7.315
17    y4 ~~    y4       2.137 0.719  2.973  0.003    0.729    3.546

3. (Incorrect) constraining the first indicators

coef(fit_con)[c("ind60=~x1", "dem60=~y1")]
ind60=~x1 dem60=~y1 
0.6697334 2.0934489 
model_conl <- '
  # latent variable definitions
    ind60 =~ 0.6697334 * x1 + x2 + x3
    dem60 =~ 2.0934489 * y1 + y2 + y3 + y4
  # regression
    dem60 ~ beta * ind60
'
fit_conl <- sem(model_conl, data = PoliticalDemocracy)
parameterEstimates(fit_conl)  # same standardized coefficient
     lhs op   rhs label   est    se      z pvalue ci.lower ci.upper
1  ind60 =~    x1       0.670 0.000     NA     NA    0.670    0.670
2  ind60 =~    x2       1.460 0.093 15.633  0.000    1.277    1.643
3  ind60 =~    x3       1.218 0.102 11.966  0.000    1.018    1.417
4  dem60 =~    y1       2.093 0.000     NA     NA    2.093    2.093
5  dem60 =~    y2       2.970 0.424  7.005  0.000    2.139    3.801
6  dem60 =~    y3       2.293 0.359  6.396  0.000    1.590    2.996
7  dem60 =~    y4       2.989 0.358  8.341  0.000    2.286    3.691
8  dem60  ~ ind60  beta 0.460 0.121  3.801  0.000    0.223    0.698
9     x1 ~~    x1       0.081 0.020  4.123  0.000    0.043    0.120
10    x2 ~~    x2       0.120 0.072  1.676  0.094   -0.020    0.261
11    x3 ~~    x3       0.467 0.091  5.153  0.000    0.289    0.644
12    y1 ~~    y1       2.404 0.516  4.663  0.000    1.394    3.415
13    y2 ~~    y2       6.552 1.289  5.082  0.000    4.025    9.078
14    y3 ~~    y3       5.363 0.996  5.384  0.000    3.410    7.315
15    y4 ~~    y4       2.137 0.719  2.973  0.003    0.728    3.546
16 ind60 ~~ ind60       1.000 0.193  5.170  0.000    0.621    1.379
17 dem60 ~~ dem60       0.788 0.200  3.947  0.000    0.397    1.179
msummary(list(`Define new par` = fit_std,
              `Constraints` = fit_con,
              `Fix loading` = fit_conl))
Define new par Constraints Fix loading
ind60 =~ x1 1.000 0.670 0.670
(0.000) (0.065) (0.000)
ind60 =~ x2 2.180 1.460 1.460
(0.139) (0.128) (0.093)
ind60 =~ x3 1.818 1.218 1.218
(0.152) (0.129) (0.102)
dem60 =~ y1 1.000 2.093 2.093
(0.000) (0.260) (0.000)
dem60 =~ y2 1.419 2.970 2.970
(0.203) (0.401) (0.424)
dem60 =~ y3 1.095 2.293 2.293
(0.171) (0.342) (0.359)
dem60 =~ y4 1.428 2.989 2.989
(0.171) (0.315) (0.358)
dem60 ~ ind60 1.439 0.460 0.460
(0.379) (0.100) (0.121)
v2 × = b^2*v1+ev2 4.382
(1.089)
beta × = b*sqrt(v1/v2) 0.460
(0.100)
Num.Obs. 75 75 75
AIC 1906.2 1906.2 1906.2
BIC 1940.9 1940.9 1940.9

As can be shown, while the first two methods give the same standardized beta in terms of estimates and standard errors, the third does not as it does not account for the uncertainty in the standard deviation.