Bayes not Bust! Why Simplicity is no Problem for Bayesians

Abstract

The advent of formal definitions of the simplicity of a theory has important implications for model selection. But what is the best way to define simplicity? Forster and Sober ([1994]) advocate the use of Akaike's Information Criterion (AIC), a non-Bayesian formalisation of the notion of simplicity. This forms an important part of their wider attack on Bayesianism in the philosophy of science. We defend a Bayesian alternative: the simplicity of a theory is to be characterised in terms of Wallace's Minimum Message Length (MML). We show that AIC is inadequate for many statistical problems where MML performs well. Whereas MML is always defined, AIC can be undefined. Whereas MML is not known ever to be statistically inconsistent, AIC can be. Even when defined and consistent, AIC performs worse than MML on small sample sizes. MML is statistically invariant under 1-to-1 re-parametrisation, thus avoiding a common criticism of Bayesian approaches. We also show that MML provides answers to many of Forster's objections to Bayesianism. Hence an important part of the attack on Bayesianism fails.

1. Introduction
2. The Curve Fitting Problem
    2.1 Curves and families of curves
    2.2 Noise
    2.3 The method of Maximum Likelihood
    2.4 ML and over-fitting
3. Akaike's Information Criterion (AIC)
4. The Predictive Accuracy Framework
5. The Minimum Message Length (MML) Principle
    5.1 The Strict MML estimator
    5.2 An example: The binomial distribution
    5.3 Properties of the SMML estimator
        5.3.1 Bayesianism
        5.3.2 Language invariance
        5.3.3 Generality
        5.3.4 Consistency and efficiency
    5.4 Similarity to false oracles
    5.5 Approximations to SMML
6. Criticisms of AIC
    6.1 Problems with ML
        6.1.1 Small sample bias in a Gaussian distribution
        6.1.2 The von Mises circular and von Mises–Fisher spherical distributions
        6.1.3 The Neyman–Scott problem
        6.1.4 Neyman–Scott, predictive accuracy and minimum expected KL distance
    6.2 Other problems with AIC
        6.2.1 Univariate polynomial regression
        6.2.2 Autoregressive econometric time series
        6.2.3 Multivariate second-order polynomial model selection
        6.2.4 Gap or no gap: a clustering-like problem for AIC
    6.3 Conclusions from the comparison of MML and AIC
7. Meeting Forster's objections to Bayesianism
    7.1 The sub-family problem
    7.2 The problem of approximation, or, which framework for statistics?
8. Conclusion
A. Details of the derivation of the Strict MML estimator
B. MML, AIC and the Gap vs. No Gap Problem
    B.1 Expected size of the largest gap
    B.2 Performance of AIC on the gap vs. no gap problem
    B.3 Performance of MML in the gap vs. no gap problem
