Large-Sample Learning of Bayesian Networks is NP-Hard

TLDR

The paper presents new complexity results for learning discrete‑variable Bayesian networks from data. The authors analyze algorithms that use a consistent, simplicity‑favoring scoring criterion on sufficiently large datasets, and they construct local distributions that make the joint distribution perfectly representable by the network structure. They prove that finding high‑scoring Bayesian network structures is NP‑hard, even when the generative distribution is perfect, or when independence, inference, or information oracles are available, or when parent limits are imposed.

Abstract

In this paper, we provide new complexity results for algorithms that learn discrete-variable Bayesian networks from data. Our results apply whenever the learning algorithm uses a scoring criterion that favors the simplest structure for which the model is able to represent the generative distribution exactly. Our results therefore hold whenever the learning algorithm uses a consistent scoring criterion and is applied to a sufficiently large dataset. We show that identifying high-scoring structures is NP-hard, even when any combination of one or more of the following hold: the generative distribution is perfect with respect to some DAG containing hidden variables; we are given an independence oracle; we are given an inference oracle; we are given an information oracle; we restrict potential solutions to structures in which each node has at most k parents, for all k>=3.Our proof relies on a new technical result that we establish in the appendices. In particular, we provide a method for constructing the local distributions in a Bayesian network such that the resulting joint distribution is provably perfect with respect to the structure of the network.

References

Page 1

	Year	Citations

Page 1