Sometimes an analytic solution does not exist
With Monte Carlo (MC) methods, one simulates the process of generating the data under an assumed data generating model
rnorm(5, mean = 0, sd = 1)
## [1] 0.8863733 1.6361050 -1.3694538 -1.1621330 1.1365392
rnorm(5, mean = 0, sd = 1) # numbers changed
## [1] 0.6607360 -0.7283291 0.5751322 1.4180376 -1.5442535
set.seed(1)
rnorm(5, mean = 0, sd = 1)
## [1] -0.6264538 0.1836433 -0.8356286 1.5952808 0.3295078
set.seed(1)
rnorm(5, mean = 0, sd = 1) # same seed, same numbers
## [1] -0.6264538 0.1836433 -0.8356286 1.5952808 0.3295078
rnorm(n, mean, sd)     # Normal distribution (mean and SD)
runif(n, min, max)     # Uniform distribution (minimum and maximum)
rchisq(n, df)          # Chi-squared distribution (degrees of freedom)
rbinom(n, size, prob)  # Binomial distribution
Other distributions include exponential, gamma, beta, t, and F
library(tidyverse)
set.seed(123)
nsim <- 20          # 20 samples
sam <- rnorm(nsim)  # default is mean = 0 and sd = 1
ggplot(tibble(x = sam), aes(x = x)) +
  geom_density(bw = "SJ") +
  stat_function(fun = dnorm, col = "red")  # overlay normal curve in red
Try increasing nsim to 100, then 1,000
Simulating Means and Medians
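As a sketch of the idea (not code from the slides; all object names are illustrative), one can simulate the sampling distributions of the mean and the median under a normal data generating model and compare their variability:

```r
# Sketch: simulate sampling distributions of the mean and median
set.seed(123)
nrep <- 1000   # number of replications
n <- 30        # sample size in each replication
means <- medians <- numeric(nrep)
for (i in seq_len(nrep)) {
  sam <- rnorm(n)            # data generating model: N(0, 1)
  means[i] <- mean(sam)
  medians[i] <- median(sam)
}
# Under normality the mean is the more efficient estimator,
# so its empirical SD across replications should be smaller
c(sd_mean = sd(means), sd_median = sd(medians))
```

With more replications, the empirical SDs approach the theoretical standard errors of the two estimators.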
Experiment | Simulation |
---|---|
Independent variables | Design factors |
Experimental conditions | Simulation conditions |
Controlled variables | Other parameters |
Procedure/Manipulation | Data generating model |
Dependent variables | Evaluation criteria |
Substantive theory | Statistical theory |
Participants | Replications |
(Sigal and Chalmers, 2016, Figure 1, p. 141)
Like experimental designs, conditions should be carefully chosen
Full factorial designs are most commonly used
Other alternatives include fractional factorial designs, random levels, etc.
Analyze the simulated data using one or more analytic approaches
For evaluating estimators:
For uncertainty estimators:
Combining bias and efficiency:
For statistical inferences:
Criterion | Cutoff | Citation |
---|---|---|
Bias | . | . |
Relative bias | ≤5% | Hoogland and Boomsma (1998) |
Standardized bias | ≤.40 | Collins, Schafer, and Kam (2001) |
SE bias | . | . |
Relative SE bias | ≤10% | Hoogland and Boomsma (1998) |
MSE | . | . |
RMSE | . | . |
Empirical Type I error (α = .05) | 2.5% - 7.5% | Bradley (1978) |
Power | . | . |
95% CI Coverage | 91%-98% | Muthén and Muthén (2002) |
Just like you're analyzing real data
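For instance, several of the criteria above can be computed from the vector of estimates across replications. The sketch below uses made-up values; `theta`, `est`, and `se_est` are assumptions for illustration, not from the slides:

```r
# Sketch: evaluation criteria from R replications (values are made up)
theta <- 0.5                               # assumed true parameter
set.seed(1)
est <- rnorm(1000, mean = 0.52, sd = 0.1)  # hypothetical estimates
se_est <- rep(0.095, 1000)                 # hypothetical SE estimates
bias <- mean(est) - theta
rel_bias <- bias / theta                   # compare to the 5% cutoff
std_bias <- bias / sd(est)                 # compare to the .40 cutoff
rel_se_bias <- mean(se_est) / sd(est) - 1  # compare to the 10% cutoff
mse <- mean((est - theta)^2)               # combines bias and efficiency
rmse <- sqrt(mse)
```

Relative SE bias here compares the average estimated SE to the empirical SD of the estimates, the usual benchmark for uncertainty estimators.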
Simulation Example on Structural Equation Modeling
Should be justified rather than relying on rules of thumb
E.g., if one wants the MC error to be ≤2.5% of the sampling variability, R needs to be 1 / .025² = 1,600
For power (also Type I error) and CI coverage, the MC error of an estimated proportion p is sqrt(p(1 − p) / R)
E.g., with R = 250 and an empirical Type I error of 5%:
sqrt((.05 * (1 - .05)) / 250)
## [1] 0.01378405
So R should be increased for more precise estimates
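The same formula can be inverted to choose R for a target MC error (a sketch; the target value below is an assumption for illustration):

```r
# Sketch: MC standard error of an empirical proportion, and the R
# needed for a target precision (target value is illustrative)
p <- .05                        # empirical Type I error rate
R <- 250
mc_se <- sqrt(p * (1 - p) / R)  # about .0138, as above
target <- .005                  # desired MC error (assumed)
R_needed <- ceiling(p * (1 - p) / target^2)
R_needed                        # 1900 replications
```

Halving the target MC error quadruples the required number of replications, since the error shrinks at rate 1/sqrt(R).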
(Boomsma, 2013, Table 1, p. 521)
See Boomsma (2013), Table 2, p. 526 for a checklist
Parallel processing in R (e.g., with the future package)
Carsey and Harden (2014) for a gentle introduction
Chalmers (2019) and Sigal and Chalmers (2016) for using the R package SimDesign
Harwell, Kohli, and Peralta-Torres (2018) for a review of design and reporting practices
Skrondal (2000), Serlin (2000), and Bandalos and Leite (2013) for additional topics
Bandalos, D. L. and W. Leite (2013). "Use of Monte Carlo studies in structural equation modeling research". In: Structural equation modeling. A second course. Ed. by G. R. Hancock and R. O. Mueller. 2nd ed. Charlotte, NC: Information Age, pp. 625-666.
Boomsma, A. (2013). "Reporting Monte Carlo studies in structural equation modeling". In: Structural Equation Modeling. A Multidisciplinary Journal 20, pp. 518-540. DOI: 10.1080/10705511.2013.797839.
Bradley, J. V. (1978). "Robustness?" In: British Journal of Mathematical and Statistical Psychology 31, pp. 144-152. DOI: 10.1111/j.2044-8317.1978.tb00581.x.
Carsey, T. M. and J. J. Harden (2014). Monte Carlo Simulation and resampling. Methods for social science. Thousand Oaks, CA: Sage.
Chalmers, P. (2019). SimDesign: Structure for Organizing Monte Carlo Simulation Designs. R package version 1.13. URL: https://CRAN.R-project.org/package=SimDesign.
Collins, L. M, J. L. Schafer, and C. Kam (2001). "A comparison of inclusive and restrictive strategies in modern missing data procedures". In: Psychological Methods 6, pp. 330-351. DOI: 10.1037//1082-989X.6.4.330.
Harwell, M, N. Kohli, and Y. Peralta-Torres (2018). "A survey of reporting practices of computer simulation studies in statistical research". In: The American Statistician 72, pp. 321-327. ISSN: 0003-1305. DOI: 10.1080/00031305.2017.1342692.
Hoogland, J. J. and A. Boomsma (1998). "Robustness studies in covariance structure modeling". In: Sociological Methods & Research 26, pp. 329-367. DOI: 10.1177/0049124198026003003.
Muthén, L. K. and B. O. Muthén (2002). "How to use a Monte Carlo study to decide on sample size and determine power". In: Structural Equation Modeling 9, pp. 599-620. DOI: 10.1207/S15328007SEM0904_8.
Serlin, R. C. (2000). "Testing for robustness in Monte Carlo studies". In: Psychological Methods 5, pp. 230-240. DOI: 10.1037//1082-989X.5.2.230.
Sigal, M. J. and R. P. Chalmers (2016). "Play it again: Teaching statistics with Monte Carlo simulation". In: Journal of Statistics Education 24.3, pp. 136-156. ISSN: 1069-1898. DOI: 10.1080/10691898.2016.1246953.
Skrondal, A. (2000). "Design and analysis of Monte Carlo experiments". In: Multivariate Behavioral Research 35, pp. 137-167. DOI: 10.1207/S15327906MBR3502_1.