Testing a Point Null Hypothesis: The Irreconcilability of P Values and Evidence

TLDR

The paper examines the problem of testing a point null hypothesis, or a small interval null, and notes that a small P value does not necessarily indicate strong evidence against the null. The authors investigate the relationship between the P value and conditional and Bayesian measures of evidence against the null hypothesis. They call a prior objective if it gives equal weight to the two hypotheses and is symmetric and nonincreasing away from the null, and they show that other definitions of objective yield qualitatively similar results. The evidence against a null can differ by an order of magnitude from the P value: when testing a normal mean, data with P = 0.05 yield a posterior probability of the null of at least 0.30 under any objective prior. The overall conclusion is that P values can be highly misleading measures of evidence.

Abstract

The problem of testing a point null hypothesis (or a "small interval" null hypothesis) is considered. Of interest is the relationship between the P value (or observed significance level) and conditional and Bayesian measures of evidence against the null hypothesis. Although one might presume that a small P value indicates the presence of strong evidence against the null, such is not necessarily the case. Expanding on earlier work [especially Edwards, Lindman, and Savage (1963) and Dickey (1977)], it is shown that actual evidence against a null (as measured, say, by posterior probability or comparative likelihood) can differ by an order of magnitude from the P value. For instance, data that yield a P value of .05, when testing a normal mean, result in a posterior probability of the null of at least .30 for any objective prior distribution. ("Objective" here means that equal prior weight is given the two hypotheses and that the prior is symmetric and nonincreasing away from the null; other definitions of "objective" will be seen to yield qualitatively similar results.) The overall conclusion is that P values can be highly misleading measures of the evidence provided by the data against the null hypothesis.
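The gap between a P value of .05 and the posterior probability of the null can be illustrated with a small calculation. The sketch below is not the paper's general argument: it assumes one particular family of objective priors, namely a normal prior centered at the null value with prior probability 1/2 on the null, and it works with the standardized statistic z. The function names and the parameter k (the ratio of the prior variance to the variance of the sample mean) are introduced here for illustration only.

```python
import math


def bayes_factor_null(z, k):
    """Bayes factor of H0: theta = theta0 against H1: theta ~ N(theta0, tau^2).

    z is the standardized statistic sqrt(n) * (xbar - theta0) / sigma.
    k is n * tau^2 / sigma^2, the prior variance relative to Var(xbar).
    Both names are illustrative assumptions, not notation from the paper.
    """
    v = 1.0 + k
    return math.sqrt(v) * math.exp(-0.5 * z * z * k / v)


def min_bayes_factor_normal(z):
    """Smallest Bayes factor for the null over all normal priors centered at theta0.

    Setting the derivative in k to zero gives 1 + k = z^2, which is
    admissible only when |z| > 1; otherwise the infimum is 1 (k -> 0).
    """
    a = abs(z)
    if a <= 1.0:
        return 1.0
    return math.sqrt(math.e) * a * math.exp(-0.5 * a * a)


def posterior_null(bayes_factor, prior_null=0.5):
    """Posterior probability of H0 given the Bayes factor and prior mass on H0."""
    prior_odds = prior_null / (1.0 - prior_null)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1.0 + posterior_odds)


z = 1.96
p_value = math.erfc(z / math.sqrt(2.0))  # two-sided normal P value

# Prior variance equal to the variance of the sample mean (k = 1).
post_k1 = posterior_null(bayes_factor_null(z, 1.0))

# Prior chosen, after seeing z, to be as unfavorable to the null as possible.
post_min = posterior_null(min_bayes_factor_normal(z))

print(f"two-sided P value:             {p_value:.4f}")
print(f"posterior P(H0), k = 1:        {post_k1:.4f}")
print(f"posterior P(H0), worst normal: {post_min:.4f}")
```

Under these assumptions, z = 1.96 gives a two-sided P value of about .05, while the posterior probability of the null is roughly .35 for k = 1 and stays above .30 (roughly .32) even for the normal prior least favorable to the null.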
