Methods for Exact Goodness-of-Fit Tests

Abstract

Numerous goodness-of-fit tests with asymptotic chi-squared distributions have been proposed for discrete multivariate data, and there has been much discussion about using asymptotic results for computing critical values when there are small expected cell values. Although exact methods would be preferred in these situations, it generally is believed that such methods are computationally intractable. We propose methods for calculating exact distributions and significance levels for goodness-of-fit statistics that are computationally feasible over a wide range of models. In particular, the distribution for a simple multinomial model can be evaluated in polynomial time. For composite null hypotheses, we calculate the distribution conditional on the sufficient statistics for the nuisance parameters. We calculate the characteristic function of a distribution and invert the characteristic function using the fast Fourier transform (FFT). Our approach emphasizes the relationship between exact methods and probability formulas. Our technique, transforming the domain of the problem, is interesting for two reasons: First, algorithms that use the FFT and the convolution theorem are efficient for calculating the distribution of sums of independent statistics; and second, less storage is needed when working in the frequency domain than in the probability domain. The algorithms can be applied to general goodness-of-fit statistics and are parallelizable.

Key Words: Conditional tests; Contingency tables; Discrete Fourier transform; Likelihood ratio statistic; Multinomial goodness-of-fit; Parallelizable algorithms; Pearson's X² statistic; Power divergence statistic
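
To illustrate the FFT idea for the simple multinomial case, the following is a minimal sketch and not necessarily the authors' algorithm. It assumes the standard representation of multinomial counts as independent Poisson counts conditioned on their total, and it restricts attention to the integer-valued statistic T = sum of squared counts, which under an equiprobable null is a monotone function of Pearson's X² (X² = (k/n)T - n). The function names `exact_sum_of_squares_dist` and `pearson_pvalue` are hypothetical, and NumPy is assumed to be available.

```python
from math import lgamma

import numpy as np


def exact_sum_of_squares_dist(n, probs):
    """Exact null distribution of T = sum_i N_i**2 for a multinomial(n, probs).

    Assumption of this sketch: every entry of probs is strictly positive.
    Returns an array d with d[t] = P(T = t) for t = 0..n*n.
    """
    k = len(probs)
    rows = k * n * n + 1  # room for every unconstrained value of sum N_i**2
    cols = k * n + 1      # room for every unconstrained value of sum N_i
    j = np.arange(n + 1)
    log_fact = np.array([lgamma(i + 1) for i in j])

    # Running product of per-cell transforms (convolution theorem).
    prod = np.ones((rows, cols // 2 + 1), dtype=complex)
    for p in probs:
        lam = n * p
        # Joint mass of (N_i**2, N_i) for a Poisson(lam) count, truncated at n.
        cell = np.zeros((rows, cols))
        cell[j**2, j] = np.exp(-lam + j * np.log(lam) - log_fact)
        prod *= np.fft.rfft2(cell)

    # Invert to get the joint mass of (sum N_i**2, sum N_i).
    joint = np.fft.irfft2(prod, s=(rows, cols))

    # Condition on sum N_i = n, which recovers the multinomial model.
    col = np.clip(joint[: n * n + 1, n], 0.0, None)  # drop tiny negative round-off
    return col / col.sum()


def pearson_pvalue(counts):
    """Exact p-value of Pearson's X^2 against an equiprobable null."""
    n, k = sum(counts), len(counts)
    dist = exact_sum_of_squares_dist(n, [1.0 / k] * k)
    t_obs = sum(c * c for c in counts)
    # X^2 is increasing in T here, so P(X^2 >= observed) = P(T >= t_obs).
    return float(dist[t_obs:].sum())
```

The arrays are padded to the full range of the unconstrained sums so that the circular convolution implied by the FFT does not wrap around. This keeps the sketch simple at the cost of memory; a statistic that is not integer-valued would first need to be mapped onto a lattice.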
