Publication | Closed Access
Multiple Imputation for Multivariate Missing-Data Problems: A Data Analyst's Perspective
1.5K
Citations
10
References
1998
Year
Engineering, Data Preparation, Multiset Data Analysis, Rn Versions, Data Science, Management, Data Integration, Biostatistics, Statistics, Latent Variable Methods, Reliability, Multidimensional Analysis, Multiple Imputation, Multivariate Missing-data Problems, Data Treatment, Statistical Inference, Listwise Deletion, Multivariate Analysis, Data Modeling
Multivariate analyses are often limited by missing values, and until recently analysts relied on ad hoc methods such as listwise deletion, but recent theoretical and computational advances have introduced flexible, statistically sound procedures. This article reviews the key concepts of multiple imputation, surveys available software, and illustrates their application to data from the Adolescent Alcohol Prevention Trial. Multiple imputation replaces each missing value with m > 1 plausible values, analyzes each completed dataset with standard methods, and combines results to produce estimates that incorporate missing‑data uncertainty, with recent algorithms and software enabling proper implementation in complex multivariate contexts.
Analyses of multivariate data are frequently hampered by missing values. Until recently, the only missing-data methods available to most data analysts have been relatively ad hoc practices such as listwise deletion. Recent dramatic advances in theoretical and computational statistics, however, have produced a new generation of flexible procedures with a sound statistical basis. These procedures involve multiple imputation (Rubin, 1987), a simulation technique that replaces each missing datum with a set of m > 1 plausible values. The m versions of the complete data are analyzed by standard complete-data methods, and the results are combined using simple rules to yield estimates, standard errors, and p-values that formally incorporate missing-data uncertainty. New computational algorithms and software described in a recent book (Schafer, 1997a) allow us to create proper multiple imputations in complex multivariate settings. This article reviews the key ideas of multiple imputation, discusses the software programs currently available, and demonstrates their use on data from the Adolescent Alcohol Prevention Trial (Hansen & Graham, 1991).
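The combining step mentioned in the abstract can be illustrated with a minimal sketch of the standard rules for a single scalar quantity (Rubin, 1987). The function name `pool_rubin`, the returned keys, and the numbers in the usage lines are hypothetical and chosen only for illustration; they are not taken from the article or its software. The sketch assumes each of the m completed-data analyses has already produced a point estimate and its squared standard error.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine m completed-data results for one scalar quantity.

    estimates: point estimates, one per imputed dataset
    variances: squared standard errors, one per imputed dataset
    (Hypothetical helper for illustration only.)
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need m > 1 estimates and one variance per estimate")

    q_bar = mean(estimates)        # pooled point estimate
    u_bar = mean(variances)        # average within-imputation variance
    b = variance(estimates)        # between-imputation variance (m - 1 denominator)
    t = u_bar + (1 + 1 / m) * b    # total variance

    # Degrees of freedom for the t reference distribution; infinite when
    # the m estimates are identical (no between-imputation variance).
    if b > 0:
        df = (m - 1) * (1 + u_bar / ((1 + 1 / m) * b)) ** 2
    else:
        df = float("inf")

    return {"estimate": q_bar, "std_error": sqrt(t), "df": df}


# Made-up inputs: three imputations of one regression coefficient.
pooled = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled)
```

The pooled standard error exceeds the average completed-data standard error whenever the estimates differ across imputations, which is how the procedure carries missing-data uncertainty into the final inference.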