Publication | Open Access
Why Propensity Scores Should Not Be Used for Matching
Year: 2019 | Citations: 1.6K | References: 64
Topics: Complete Randomization; Treatment Effect; Decision Science; Propensity Score Matching; Policy Analysis; Causal Inference; Data Science; Social Matching; Bias; Experimental Economics; Statistics; Selection Bias; Matching Technique; Outcomes Research; Applied Social Psychology; Matching Methods; Marginal Structural Models; Propensity Scores; Algorithmic Fairness; Matching Theory; Business; Time-varying Confounding; Statistical Inference; Medicine
PSM seeks to emulate a completely randomized experiment, whereas other matching methods approximate a fully blocked one, so PSM ignores the often large share of imbalance that full blocking can remove. We show that PSM often increases imbalance, inefficiency, model dependence, and bias; in data that are already well balanced, it can worsen imbalance relative to the original data. Researchers should therefore replace it with other matching methods, though propensity scores still have other productive uses.
We show that propensity score matching (PSM), an enormously popular method of preprocessing data for causal inference, often accomplishes the opposite of its intended goal—thus increasing imbalance, inefficiency, model dependence, and bias. The weakness of PSM comes from its attempts to approximate a completely randomized experiment, rather than, as with other matching methods, a more efficient fully blocked randomized experiment. PSM is thus uniquely blind to the often large portion of imbalance that can be eliminated by approximating full blocking with other matching methods. Moreover, in data balanced enough to approximate complete randomization, either to begin with or after pruning some observations, PSM approximates random matching which, we show, increases imbalance even relative to the original data. Although these results suggest researchers replace PSM with one of the other available matching methods, propensity scores have other productive uses.
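The claim that random matching can worsen imbalance in already balanced data can be illustrated with a toy simulation. The sketch below is not the authors' analysis. It assumes a single standard-normal covariate, uses the absolute difference in group means as a stand-in imbalance measure, and simulates random matching directly instead of estimating a propensity score. The function names `mean_diff_imbalance` and `simulate`, the sample sizes, and the rank-pairing rule used as a crude proxy for matching on the covariate are all hypothetical choices made for illustration.

```python
import random
import statistics


def mean_diff_imbalance(treated, control):
    """Absolute difference in covariate means between the two groups."""
    return abs(statistics.fmean(treated) - statistics.fmean(control))


def simulate(n=200, keep=50, reps=300, seed=0):
    """Average imbalance over `reps` draws for three designs.

    Returns (full data, random matching, covariate matching).
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    full, random_matched, covariate_matched = [], [], []
    for _ in range(reps):
        # Balanced data: both groups draw X from the same distribution,
        # as under complete randomization.
        treated = [rng.gauss(0, 1) for _ in range(n)]
        control = [rng.gauss(0, 1) for _ in range(n)]
        full.append(mean_diff_imbalance(treated, control))

        # Random matching: keep `keep` units per group without looking at X.
        t_sub = rng.sample(treated, keep)
        c_sub = rng.sample(control, keep)
        random_matched.append(mean_diff_imbalance(t_sub, c_sub))

        # Matching on X: pair units by rank, keep the `keep` closest pairs.
        pairs = sorted(
            zip(sorted(treated), sorted(control)),
            key=lambda p: abs(p[0] - p[1]),
        )[:keep]
        covariate_matched.append(
            mean_diff_imbalance([p[0] for p in pairs], [p[1] for p in pairs])
        )
    return (
        statistics.fmean(full),
        statistics.fmean(random_matched),
        statistics.fmean(covariate_matched),
    )


if __name__ == "__main__":
    full, random_matched, covariate_matched = simulate()
    print(f"full data:          {full:.4f}")
    print(f"random matching:    {random_matched:.4f}")
    print(f"covariate matching: {covariate_matched:.4f}")
```

Under these assumptions, pruning at random leaves smaller groups whose means differ more on average than in the full data, while pairing on the covariate itself reduces the difference. The sketch says nothing about how large the effect is in real applications.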