A Comparative Study of Various Tests for Normality

TLDR

The study compares nine tests of normality: Shapiro–Wilk (W), skewness, kurtosis, Kolmogorov–Smirnov, Cramer–von Mises, weighted Cramer–von Mises, modified Kolmogorov–Smirnov, chi‑squared, and Studentized range. In an empirical sampling study covering 45 alternative distributions from 12 families at 5 sample sizes, the authors compare the tests on sensitivity, on the means and variances of the statistics under each alternative, on how sensitivity depends on sample size, and on the effect of misspecified parameters on the simple-hypothesis tests. Shapiro–Wilk was generally the best omnibus test. The distance tests were typically very insensitive. The Studentized range was excellent against symmetric alternatives, especially short‑tailed ones, but had virtually no sensitivity to asymmetry. Combining skewness and kurtosis usually gave a sensitive judgment, but the combination was still usually dominated by Shapiro–Wilk. With sensitive procedures, extreme non‑normality (e.g., the exponential distribution) could be detected in samples of fewer than 20 observations.

Abstract

Results are given of an empirical sampling study of the sensitivities of nine statistical procedures for evaluating the normality of a complete sample. The nine statistics are W (Shapiro and Wilk, 1965), √b1 (standard third moment), b2 (standard fourth moment), KS (Kolmogorov-Smirnov), CM (Cramer-von Mises), WCM (weighted CM), D (modified KS), CS (chi-squared) and u (Studentized range). Forty-five alternative distributions in twelve families and five sample sizes were studied. Results are included on the comparison of the statistical procedures in relation to groupings of the alternative distributions, on means and variances of the statistics under the various alternatives, on dependence of sensitivities on sample size, on approach to normality as measured by the W statistic within some classes of distribution, and on the effect of misspecification of parameters on the performance of the simple hypothesis test statistics. The general findings include: (i) the W statistic provides a generally superior omnibus measure of non-normality; (ii) the distance tests (KS, CM, WCM, D) are typically very insensitive; (iii) the u statistic is excellent against symmetric, especially short-tailed, distributions but has virtually no sensitivity to asymmetry; (iv) a combination of both √b1 and b2 usually provides a sensitive judgment but even their combined performance is usually dominated by W; (v) with sensitive procedures, good indication of extreme non-normality (e.g., the exponential distribution) can be achieved with samples of size less than 20.
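To make the idea of an empirical sampling study concrete, the following is a minimal sketch of how sensitivities might be estimated by simulation for the three moment-type statistics √b1, b2 and u. It is not the authors' procedure. The helper names (`moments_stats`, `simulate`, `two_sided_cutoffs`, `estimated_power`), the two-sided 5% rejection rule, the use of simulated rather than tabulated critical points, the choice of n = 20 with 2000 replications, and the exponential and uniform alternatives are all assumptions made for illustration.

```python
import math
import random


def moments_stats(x):
    """Return (sqrt_b1, b2, u) for one sample.

    sqrt_b1 = m3 / m2**1.5 (standard third moment),
    b2 = m4 / m2**2 (standard fourth moment),
    u = sample range / sample standard deviation (Studentized range).
    """
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    s = math.sqrt(m2 * n / (n - 1))  # standard deviation with divisor n - 1
    return m3 / m2 ** 1.5, m4 / m2 ** 2, (max(x) - min(x)) / s


def simulate(draw, n, reps, rng):
    """Compute the three statistics on `reps` samples of size `n`."""
    return [moments_stats([draw(rng) for _ in range(n)]) for _ in range(reps)]


def two_sided_cutoffs(values, alpha=0.05):
    """Empirical lower and upper alpha/2 points of simulated null values."""
    v = sorted(values)
    lo = v[int(len(v) * alpha / 2)]
    hi = v[int(len(v) * (1 - alpha / 2)) - 1]
    return lo, hi


def estimated_power(null_stats, alt_stats, index, alpha=0.05):
    """Fraction of alternative samples falling outside the null cutoffs."""
    lo, hi = two_sided_cutoffs([s[index] for s in null_stats], alpha)
    rejected = sum(1 for s in alt_stats if s[index] < lo or s[index] > hi)
    return rejected / len(alt_stats)


# Hypothetical settings: n = 20, 2000 replications, fixed seed.
rng = random.Random(0)
n, reps = 20, 2000
null = simulate(lambda r: r.gauss(0, 1), n, reps, rng)      # normal (null)
expo = simulate(lambda r: r.expovariate(1.0), n, reps, rng)  # asymmetric
unif = simulate(lambda r: r.random(), n, reps, rng)          # symmetric, short-tailed

for name, idx in (("sqrt_b1", 0), ("b2", 1), ("u", 2)):
    print(name, estimated_power(null, expo, idx), estimated_power(null, unif, idx))
```

Under these assumptions, the printed rejection rates give a rough feel for the pattern the abstract describes: a skewness-based statistic responds to the exponential alternative, while u responds to the short-tailed uniform alternative. The W statistic and the distance tests are omitted because they need tabulated coefficients or distribution functions beyond this sketch.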
