Mark Lai
April 11, 2020

In our lab meeting I'm going to present the article by Maydeu-Olivares (2017), which talks about standard errors (SEs) of the maximum likelihood estimators (MLEs) in SEM. As I'm also working on something relevant, and I haven't found a good reference (I'm sure there are some; I just haven't done a deep search), I decided to write this demo to show how the SEs can be obtained in R and compare them to the lavaan package, which automates these computations.

library(lavaan)
># This is lavaan 0.6-17
># lavaan is FREE software! Please report any bugs.
To keep things simple, I'll use a one-factor model with four indicators (without mean structure):

$$\Sigma = \Lambda \Lambda' + \Theta,$$

where $\Lambda$ is a $4 \times 1$ vector of loadings and $\Theta$ is a $4 \times 4$ diagonal matrix of unique variances.
The model parameters are the loadings in $\Lambda$ (LAM in the code) and the unique variances in $\Theta$ (THETA in the code). I'll simulate data that follow the normal distribution:

LAM <- rep(.7, 4) # loading vector
THETA <- diag(1 - LAM^2) # diagonal matrix of unique variances
N <- 100 # sample size
set.seed(1) # Set seed
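One way to generate the sample (a sketch: the exact generating call isn't shown above, so MASS::mvrnorm and the indicator names y1 to y4 are assumptions consistent with the rest of the code):

# Simulate N observations from a multivariate normal distribution with
# the model-implied covariance matrix (one of several ways to do this)
y <- MASS::mvrnorm(N, mu = rep(0, 4), Sigma = tcrossprod(LAM) + THETA)
colnames(y) <- paste0("y", 1:4) # indicator names used later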
Without a mean structure, the sample covariance matrix (Sy in the code) is the sufficient statistic:

(Sy <- cov(y) * (N - 1) / N)
># [,1] [,2] [,3] [,4]
># [1,] 0.9962082 0.4808498 0.3465193 0.3733755
># [2,] 0.4808498 1.0076039 0.4208120 0.5062814
># [3,] 0.3465193 0.4208120 0.9332648 0.3749744
># [4,] 0.3733755 0.5062814 0.3749744 0.9388328
Note that I divided by $N$ instead of the usual $N - 1$, as the MLE of the covariance matrix divides by $N$; this also matches lavaan's default (likelihood = "normal").
The log-likelihood of the parameters given the observed covariances, dropping the constant term (which does not affect the maximization), is

$$\ell(\Lambda, \Theta; S_y) = -\frac{N}{2} \left[\log \lvert \Sigma \rvert + \mathrm{tr}\left(\Sigma^{-1} S_y\right)\right], \quad \Sigma = \Lambda \Lambda' + \Theta.$$

This is pretty straightforward to compute in R:
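A minimal version of such a function (the name lSigma and the parameter layout, four loadings followed by four unique variances, are chosen to match how the function is called below):

lSigma <- function(pars) {
  # pars: loadings (elements 1-4) and unique variances (elements 5-8)
  Sigma <- tcrossprod(pars[1:4]) + diag(pars[5:8])
  # Log-likelihood based on the sample covariance (constant term dropped)
  -N / 2 * (log(det(Sigma)) + sum(diag(solve(Sigma, Sy))))
}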
In maximum likelihood estimation, we find the parameter values that maximize the above log-likelihood function (and hence the likelihood, as log is monotonic). As an example, consider two sets of candidate values, with the loadings and unique variances stored as lam1 and theta1 for the first set, and lam2 and theta2 for the second:

lSigma(c(lam1, theta1))
># [1] -158.8054
lSigma(c(lam2, theta2))
># [1] -161.6866
The first set gives a higher (i.e., less negative) log-likelihood value, so it should be preferred. We could try many sets of values by hand until we find the best, but that is very tedious, so instead we rely on an optimizer. In R, a general-purpose optimizer is available through the optim() function.
One thing to note before we start: optimizers usually find the parameters that minimize an objective function, whereas here we want to maximize it. A common trick is to instead minimize the -2 log-likelihood:
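A one-line wrapper does the job (reusing lSigma from above):

# -2 times the log-likelihood; minimizing this maximizes the likelihood
minus2lSigma <- function(pars) {
  -2 * lSigma(pars)
}

Multiplying the two values above by -2 gives roughly 317.61 for the first set and 323.37 for the second.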
So again the first set is preferred, now with the lower -2 log-likelihood. We can now run optim() with the "L-BFGS-B" algorithm. It requires starting values, so I put in the values of set 1 (you can verify that the results are the same with set 2 as starting values); the lower bounds keep the unique variances non-negative:
opt_info <- optim(c(lam1, theta1), minus2lSigma,
lower = c(rep(-Inf, 4), rep(0, 4)),
method = "L-BFGS-B", hessian = TRUE)
# Maximum likelihood estimates:
opt_info$par
># [1] 0.6068613 0.7783132 0.5576544 0.6467392 0.6279234 0.4018284 0.6222836
># [8] 0.5205601
So in terms of point estimates, that's it! Let's compare the results to lavaan:
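A sketch of the comparison (fit1 fits the same one-factor model with lavaan::cfa; std.lv = TRUE fixes the factor variance to 1 so that all four loadings are free, and fit1 is reused later):

# Fit the one-factor model in lavaan and compare the point estimates
fit1 <- cfa('f =~ y1 + y2 + y3 + y4', data = as.data.frame(y), std.lv = TRUE)
cbind(optim = opt_info$par,
      `lavaan::cfa` = coef(fit1))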
># optim lavaan::cfa
># f=~y1 0.6068613 0.6068697
># f=~y2 0.7783132 0.7783153
># f=~y3 0.5576544 0.5576568
># f=~y4 0.6467392 0.6467406
># y1~~y1 0.6279234 0.6279174
># y2~~y2 0.4018284 0.4018292
># y3~~y3 0.6222836 0.6222836
># y4~~y4 0.5205601 0.5205593
So they are essentially the same (the small differences in the later digits are mainly due to the different optimizer and starting values).
Now onto standard errors, which are the main focus of Maydeu-Olivares (2017). The paper also talks about sandwich estimators, which are robust (to some degree) to nonnormality; I will focus first on the SEs based on the observed and expected information, and then show how to obtain the MLM and MLR standard errors in plain R.
Although in the statistics literature the observed information is generally preferred over the expected information (both of which are related to the second partial derivatives of the log-likelihood function; see here), in lavaan the default is to use the expected information, so I will demonstrate that first.
As described in the paper, the asymptotic covariance matrix of the MLEs, which estimates how the estimates vary and covary across repeated samples, has this formula:

$$\widehat{\mathrm{ACOV}}(\hat\theta) = N^{-1} \left(\Delta' \hat{V} \Delta\right)^{-1},$$

where $\Delta = \partial\, \mathrm{vec}\, \Sigma(\theta) / \partial \theta'$ contains the derivatives of the model-implied covariances with respect to the parameters and, under normality, $\hat{V} = \frac{1}{2} \left(\hat\Sigma^{-1} \otimes \hat\Sigma^{-1}\right)$ is the expected information for $\mathrm{vec}\, \Sigma$. The asymptotic standard errors are the square roots of the diagonal elements of this matrix. The $\Delta$ matrix can be obtained numerically with the numDeriv::jacobian() function like:
# Using parameter estimates from optim to construct Sigma
(Sigmahat <- tcrossprod(opt_info$par[1:4]) + diag(opt_info$par[5:8]))
># [,1] [,2] [,3] [,4]
># [1,] 0.9962040 0.4723281 0.3384188 0.3924810
># [2,] 0.4723281 1.0075997 0.4340297 0.5033656
># [3,] 0.3384188 0.4340297 0.9332620 0.3606569
># [4,] 0.3924810 0.5033656 0.3606569 0.9388317
# Make it into a function, outputting vec(Sigma)
vecSigma <- function(pars) {
as.vector(tcrossprod(pars[1:4]) + diag(pars[5:8]))
}
# Numeric derivative:
(Delta <- numDeriv::jacobian(vecSigma, opt_info$par))
># [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
># [1,] 1.2137226 0.0000000 0.0000000 0.0000000 1 0 0 0
># [2,] 0.7783132 0.6068613 0.0000000 0.0000000 0 0 0 0
># [3,] 0.5576544 0.0000000 0.6068613 0.0000000 0 0 0 0
># [4,] 0.6467392 0.0000000 0.0000000 0.6068613 0 0 0 0
># [5,] 0.7783132 0.6068613 0.0000000 0.0000000 0 0 0 0
># [6,] 0.0000000 1.5566263 0.0000000 0.0000000 0 1 0 0
># [7,] 0.0000000 0.5576544 0.7783132 0.0000000 0 0 0 0
># [8,] 0.0000000 0.6467392 0.0000000 0.7783132 0 0 0 0
># [9,] 0.5576544 0.0000000 0.6068613 0.0000000 0 0 0 0
># [10,] 0.0000000 0.5576544 0.7783132 0.0000000 0 0 0 0
># [11,] 0.0000000 0.0000000 1.1153087 0.0000000 0 0 1 0
># [12,] 0.0000000 0.0000000 0.6467392 0.5576544 0 0 0 0
># [13,] 0.6467392 0.0000000 0.0000000 0.6068613 0 0 0 0
># [14,] 0.0000000 0.6467392 0.0000000 0.7783132 0 0 0 0
># [15,] 0.0000000 0.0000000 0.6467392 0.5576544 0 0 0 0
># [16,] 0.0000000 0.0000000 0.0000000 1.2934784 0 0 0 1
Now, we’re ready to compute the asymptotic standard errors:
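A sketch of that computation (invSigmahat, exp_I, and acov_exp are also reused in the sandwich estimators below; ase_exp holds the resulting standard errors):

# Inverse of the model-implied covariance matrix at the MLE
invSigmahat <- solve(Sigmahat)
# Expected information matrix for vec(Sigma)
exp_I <- invSigmahat %x% invSigmahat / 2
# Asymptotic covariance matrix of the estimates
acov_exp <- solve(crossprod(Delta, exp_I %*% Delta)) / N
ase_exp <- sqrt(diag(acov_exp))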
Compared to lavaan:
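A sketch of the comparison, using fit1 from before (fitted with the default expected information):

cbind(optim = ase_exp,
      `lavaan::cfa ('expected')` = sqrt(diag(lavaan::vcov(fit1))))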
># optim lavaan::cfa ('expected')
># f=~y1 0.10400371 0.10400353
># f=~y2 0.10242249 0.10242227
># f=~y3 0.10139392 0.10139383
># f=~y4 0.09991077 0.09991052
># y1~~y1 0.10889211 0.10889173
># y2~~y2 0.10757581 0.10757529
># y3~~y3 0.10420294 0.10420288
># y4~~y4 0.09956021 0.09955978
So that matches pretty well!
While Maydeu-Olivares (2017) also presented the formula (in matrix form) for the asymptotic covariance estimates using the observed information, in practice such a matrix is usually obtained from the Hessian, which is typically part of the output of the optimization algorithm used for MLE. For example, going back to the output of the optim() call,
# We minimized -2 times the log-likelihood, and the Hessian of -2*loglik
# is twice the observed information, so divide by 2 before inverting;
# the inverse is the asymptotic covariance matrix
acov_obs <- solve(opt_info$hessian / 2)
(ase_obs <- sqrt(diag(acov_obs)))
># [1] 0.10381614 0.10240102 0.10189012 0.09983351 0.10862886 0.10752628 0.10480306
># [8] 0.09943057
And compare to lavaan using the information = 'observed' option:
fit1_obs <- lavaan::update(fit1, information = 'observed')
cbind(optim = ase_obs,
`lavaan::cfa ('obs')` = sqrt(diag(lavaan::vcov(fit1_obs))))
># optim lavaan::cfa ('obs')
># f=~y1 0.10381614 0.10381636
># f=~y2 0.10240102 0.10240110
># f=~y3 0.10189012 0.10189033
># f=~y4 0.09983351 0.09983363
># y1~~y1 0.10862886 0.10862830
># y2~~y2 0.10752628 0.10752643
># y3~~y3 0.10480306 0.10480358
># y4~~y4 0.09943057 0.09943066
So again it’s pretty close. You can try using the formula in Maydeu-Olivares (2017) to verify the results.
For the MLM standard errors, the sandwich uses the expected-information-based acov_exp as the "bread" and, as the "meat", the asymptotic covariance matrix of the sample covariances estimated from the fourth-order moments of the data (as in the ADF estimator):

# Asymptotic covariance of vec(S) based on 4th-order moments
# Each column of Wi_adf is vec[(y_i - ybar)(y_i - ybar)'] for one observation
Wi_adf <- apply(t(y) - colMeans(y), 2, tcrossprod)
adf_I <- tcrossprod(Wi_adf - as.vector(Sy)) / N
# Sandwich estimator:
Deltat_exp_I <- crossprod(Delta, exp_I)
acov_mlm <- acov_exp %*% Deltat_exp_I %*% adf_I %*%
crossprod(Deltat_exp_I, acov_exp) * N
ase_mlm <- sqrt(diag(acov_mlm))
Compare to lavaan with estimator = 'MLM':
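A sketch of the comparison code, refitting the model with the robust MLM estimator:

fit1_mlm <- lavaan::update(fit1, estimator = "MLM")
cbind(optim = ase_mlm,
      `lavaan::cfa ('MLM')` = sqrt(diag(lavaan::vcov(fit1_mlm))))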
># optim lavaan::cfa ('MLM')
># f=~y1 0.12965429 0.12965371
># f=~y2 0.10522671 0.10522608
># f=~y3 0.08901649 0.08901603
># f=~y4 0.08981597 0.08981560
># y1~~y1 0.10278055 0.10278061
># y2~~y2 0.10252608 0.10252560
># y3~~y3 0.09792141 0.09792136
># y4~~y4 0.10608980 0.10608957
For the MLR standard errors, the "bread" is instead the Hessian-based (observed-information) acov_obs, and the "meat" is built from the cross-products of the casewise scores:

# Cross-product matrix (without mean structure)
# First, get each observation's deviation of vec[(y_i - ybar)(y_i - ybar)']
# from the model-implied covariances
Sdev <- apply(t(y) - colMeans(y), 2, tcrossprod) - as.vector(Sigmahat)
# Casewise scores (gradients) with respect to vec(Sigma)
g_score <- crossprod(Sdev, invSigmahat %x% invSigmahat) / 2
xp_I <- crossprod(g_score) / N # Gamma^-1_XP
# Sandwich estimator:
acov_mlr <- acov_obs %*% crossprod(Delta, xp_I %*% Delta) %*% acov_obs * N
ase_mlr <- sqrt(diag(acov_mlr))
Compare to lavaan with estimator = 'MLR':
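A sketch of the comparison code (the 'Hessian' label presumably reflects lavaan's observed.information = "hessian" default for MLR):

fit1_mlr <- lavaan::update(fit1, estimator = "MLR")
cbind(optim = ase_mlr,
      `lavaan::cfa ('MLR, Hessian')` = sqrt(diag(lavaan::vcov(fit1_mlr))))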
># optim lavaan::cfa ('MLR, Hessian')
># f=~y1 0.12884960 0.12885037
># f=~y2 0.10504885 0.10504891
># f=~y3 0.08977431 0.08977434
># f=~y4 0.08883634 0.08883676
># y1~~y1 0.10203109 0.10203105
># y2~~y2 0.10139778 0.10139847
># y3~~y3 0.09785923 0.09786031
># y4~~y4 0.10566978 0.10567060