Probabilistic Inference Using Markov Chain Monte Carlo Methods

TLDR

Probabilistic inference is a powerful tool for uncertain reasoning in AI, but realistic models yield complex distributions over high-dimensional spaces that are computationally difficult to work with. This motivates Markov chain Monte Carlo techniques developed in other fields, such as the Metropolis algorithm, Gibbs sampling, and hybrid Monte Carlo, along with related uses of Markov chain sampling in simulated annealing and approximate counting. This review surveys the role of probabilistic inference in AI, presents the theory of Markov chains, and describes a wide range of MCMC algorithms and supporting techniques. Illustrative applications include inference in expert systems, discovery of latent classes from data, and Bayesian learning for neural networks.

Abstract

Probabilistic inference is an attractive approach to uncertain reasoning and empirical learning in artificial intelligence. Computational difficulties arise, however, because probabilistic models with the necessary realism and flexibility lead to complex distributions over high-dimensional spaces. Related problems in other fields have been tackled using Monte Carlo methods based on sampling using Markov chains, providing a rich array of techniques that can be applied to problems in artificial intelligence. The “Metropolis algorithm” has been used to solve difficult problems in statistical physics for over forty years, and, in the last few years, the related method of “Gibbs sampling” has been applied to problems of statistical inference. Concurrently, an alternative method for solving problems in statistical physics by means of dynamical simulation has been developed as well, and has recently been unified with the Metropolis algorithm to produce the “hybrid Monte Carlo” method. In computer science, Markov chain sampling is the basis of the heuristic optimization technique of “simulated annealing”, and has recently been used in randomized algorithms for approximate counting of large sets. In this review, I outline the role of probabilistic inference in artificial intelligence, present the theory of Markov chains, and describe various Markov chain Monte Carlo algorithms, along with a number of supporting techniques. I try to present a comprehensive picture of the range of methods that have been developed, including techniques from the varied literature that have not yet seen wide application in artificial intelligence, but which appear relevant. As illustrative examples, I use the problems of probabilistic inference in expert systems, discovery of latent classes from data, and Bayesian learning for neural networks.
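
To make the idea of "sampling using Markov chains" concrete, the following is a minimal sketch of a random-walk Metropolis sampler for a one-dimensional target. It is an illustration under stated assumptions, not code from the review: the function names (`metropolis`, `standard_normal_log_density`), the Gaussian proposal, the step size, and the standard normal target are all choices made here for the example.

```python
import math
import random


def metropolis(log_density, x0, n_steps, step_size=1.0, seed=0):
    """Random-walk Metropolis sampler (illustrative sketch).

    log_density: function returning the log of an unnormalized target density.
    The symmetric Gaussian proposal and all parameter names are assumptions
    made for this example.
    """
    rng = random.Random(seed)
    x = x0
    log_p = log_density(x)
    samples = []
    for _ in range(n_steps):
        # Propose a move from a symmetric distribution centered on x.
        proposal = x + rng.gauss(0.0, step_size)
        log_p_new = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        # 1 - rng.random() lies in (0, 1], so the log is always defined.
        if math.log(1.0 - rng.random()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
        # The current state is recorded whether or not the move was accepted.
        samples.append(x)
    return samples


def standard_normal_log_density(x):
    # Log of an unnormalized standard normal density (example target).
    return -0.5 * x * x


if __name__ == "__main__":
    draws = metropolis(standard_normal_log_density, x0=0.0, n_steps=20000)
    mean = sum(draws) / len(draws)
    var = sum((d - mean) ** 2 for d in draws) / len(draws)
    print(f"sample mean = {mean:.3f}, sample variance = {var:.3f}")
```

Only ratios of the target density enter the acceptance test, so the normalizing constant is never needed; this is one reason such samplers suit the complex high-dimensional distributions described above. For a long enough run on this example target, the sample mean and variance should come out near 0 and 1.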
