Local Causal and Markov Blanket Induction for Causal Discovery and Feature Selection for Classification Part I: Algorithms and Empirical Evaluation

TLDR

The authors propose Generalized Local Learning (GLL), an algorithmic framework for learning local causal structure (direct causes, direct effects, and Markov blankets) around target variables in very large data sets with relatively small samples. GLL can be instantiated in many ways, yielding both existing state-of-the-art and novel algorithms, and the resulting algorithms are sound under well-defined sufficient conditions. The authors evaluate several GLL-derived algorithms on real data for predictivity and feature-set parsimony, and on simulated and resimulated data for local causal neighborhood induction, comparing them against other local causal discovery methods and state-of-the-art non-causal feature-selection methods. The experiments show that local causal feature-selection methods are highly predictive, parsimonious, and causally interpretable, whereas non-causal methods cannot be interpreted causally even when they achieve excellent predictivity; the authors therefore recommend only local causal techniques when causal insight is sought. A companion paper extends the framework to scalable global causal graph learning.

Abstract

We present an algorithmic framework for learning local causal structure around target variables of interest, in the form of direct causes/effects and Markov blankets, applicable to very large data sets with relatively small samples. The selected feature sets can be used for causal discovery and classification. The framework (Generalized Local Learning, or GLL) can be instantiated in numerous ways, giving rise to both existing state-of-the-art and novel algorithms. The resulting algorithms are sound under well-defined sufficient conditions. In a first set of experiments we evaluate several algorithms derived from this framework in terms of predictivity and feature set parsimony, and compare them to other local causal discovery methods and to state-of-the-art non-causal feature selection methods using real data. A second set of experimental evaluations compares the algorithms in terms of their ability to induce local causal neighborhoods using simulated and resimulated data, and examines the relation of predictivity to causal induction performance. Our experiments demonstrate, consistently with causal feature selection theory, that local causal feature selection methods (under broad assumptions encompassing appropriate families of distributions, types of classifiers, and loss functions) exhibit strong feature set parsimony, high predictivity, and local causal interpretability. Although non-causal feature selection methods are often used in practice to shed light on causal relationships, we find that they cannot be interpreted causally even when they achieve excellent predictivity. We therefore conclude that only local causal techniques should be used when insight into causal structure is sought. In a companion paper we examine in depth the behavior of GLL algorithms, provide extensions, and show how local techniques can be used for scalable and accurate global causal graph learning.
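
To make the target of local learning concrete, the following minimal sketch computes the direct causes, direct effects, and Markov blanket of a variable in a causal graph that is already known. It is not a GLL algorithm: GLL methods have to infer these sets from data, whereas this example only illustrates the standard graphical definition (the Markov blanket of a node is its parents, its children, and the other parents of its children). The toy graph and the function names are hypothetical and chosen for illustration only.

```python
# Illustrative only: reads the local causal neighborhood off a KNOWN DAG.
# GLL algorithms must estimate these sets from data; this is not GLL.

def parents(dag, target):
    """Direct causes: every node with an edge pointing into the target."""
    return {node for node, kids in dag.items() if target in kids}


def children(dag, target):
    """Direct effects: every node the target points to."""
    return set(dag.get(target, ()))


def markov_blanket(dag, target):
    """Parents, children, and the children's other parents (spouses)."""
    pa = parents(dag, target)
    ch = children(dag, target)
    spouses = set()
    for child in ch:
        spouses |= parents(dag, child)
    return (pa | ch | spouses) - {target}


# Hypothetical toy graph (adjacency lists map a node to its children):
# A -> T, B -> T, T -> C, D -> C, C -> E
toy_dag = {"A": ["T"], "B": ["T"], "T": ["C"], "D": ["C"], "C": ["E"], "E": []}

if __name__ == "__main__":
    print("direct causes :", sorted(parents(toy_dag, "T")))
    print("direct effects:", sorted(children(toy_dag, "T")))
    print("Markov blanket:", sorted(markov_blanket(toy_dag, "T")))
```

In this toy graph, D belongs to the Markov blanket of T only because it shares the child C with T, while E, a descendant that is not a direct effect, is excluded.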
