Treatments of Missing Data: A Monte Carlo Comparison of RBHDI, Iterative Stochastic Regression Imputation, and Expectation-Maximization
Citations: 321
References: 21
Year: 2000
Engineering, Data Science, Incompleteness, Monte Carlo Comparison, Estimation Statistic, Statistical Foundation, Monte Carlo Investigation, Model Misspecification, Stochastic Regression Imputation, Data Treatment, Data Preparation, Statistical Inference, Data Analytics, Statistics, Semi-nonparametric Estimation
The study conducts a Monte Carlo comparison of four missing‑data methods. Synthetic datasets were generated under a single structured model with varying sample sizes, distributions, and missingness proportions, and the four methods—resemblance‑based hot‑deck, iterated stochastic regression, structured‑model EM, and saturated‑model EM—were applied to assess reconstruction accuracy and variance‑covariance recovery. Expectation‑maximization methods outperformed the others across all conditions, and practical issues such as model misspecification, convergence failure, and data sharing were discussed.
This article describes a Monte Carlo investigation of 4 methods for treating incomplete data. Data sets conforming to a single structured model, but varying in sample size, distributional characteristics, and proportion of data deleted, were randomly produced. Resemblance-based hot-deck imputation, iterated stochastic regression imputation, structured-model expectation-maximization, and saturated-model expectation-maximization were applied to these data sets, and these methods were then compared in terms of their ability to reconstruct the original data, the intact-data variances and covariances, and the population variances and covariances. The results favored the expectation-maximization methods, regardless of sample size, proportion of data missing, and distributional characteristics of the data. The results are discussed with respect to practical considerations in the choice of missing-data treatment, including the possibilities of model misspecification, convergence failure, and the need to make data available to other investigators.
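The kind of experiment the abstract describes can be illustrated with a small sketch of one Monte Carlo replication using saturated-model expectation-maximization. Everything specific here is an assumption made for illustration and is not taken from the article: the three-variable multivariate normal population (`TRUE_COV`), deletion completely at random, the fixed iteration count, and the function names `em_mvn` and `one_replication`. The article's structured model, deletion procedure, and accuracy measures may differ.

```python
import numpy as np

# Hypothetical population covariance matrix, chosen only for illustration.
TRUE_COV = np.array([[1.0, 0.5, 0.3],
                     [0.5, 1.0, 0.4],
                     [0.3, 0.4, 1.0]])

def em_mvn(x, n_iter=50):
    """EM estimates of the mean and covariance of multivariate normal
    data in which missing entries are coded as NaN (saturated model)."""
    n, p = x.shape
    miss = np.isnan(x)
    mu = np.nanmean(x, axis=0)              # starting values from observed cells
    sigma = np.diag(np.nanvar(x, axis=0))
    for _ in range(n_iter):
        filled = np.where(miss, 0.0, x)
        extra = np.zeros((p, p))            # summed conditional covariances
        for i in range(n):
            m = miss[i]
            if not m.any():
                continue                    # complete row: nothing to impute
            o = ~m
            if not o.any():                 # fully missing row
                filled[i] = mu
                extra += sigma
                continue
            # E-step: regress the missing variables on the observed ones
            beta = sigma[np.ix_(m, o)] @ np.linalg.inv(sigma[np.ix_(o, o)])
            filled[i, m] = mu[m] + beta @ (x[i, o] - mu[o])
            extra[np.ix_(m, m)] += sigma[np.ix_(m, m)] - beta @ sigma[np.ix_(o, m)]
        # M-step: update the moments from the expected sufficient statistics
        mu = filled.mean(axis=0)
        centered = filled - mu
        sigma = (centered.T @ centered + extra) / n
    return mu, sigma

def one_replication(rng, n=200, prop_missing=0.3):
    """Generate data, delete cells at random, and measure covariance recovery."""
    full = rng.multivariate_normal(np.zeros(3), TRUE_COV, size=n)
    holed = full.copy()
    holed[rng.random(full.shape) < prop_missing] = np.nan
    _, sigma = em_mvn(holed)
    intact_cov = np.cov(full, rowvar=False, bias=True)
    # Largest absolute error against intact-data and population covariances
    return np.abs(sigma - intact_cov).max(), np.abs(sigma - TRUE_COV).max()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    errors = np.array([one_replication(rng) for _ in range(20)])
    print("mean max error (intact, population):", errors.mean(axis=0))
```

A fuller comparison along the lines of the abstract would repeat such replications over a grid of sample sizes, missingness proportions, and distributions, and would run each competing method on the same incomplete data sets.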