
Publication | Closed Access

Thirteen Ways to Look at the Correlation Coefficient

Citations: 2.9K

References: 22

Year: 1988

TLDR

The article marks the centennial of Sir Francis Galton’s 1885 discussion of regression and correlation and notes that Karl Pearson developed the correlation coefficient, Pearson’s r, about a decade later. After a brief history, it presents thirteen formulas for r, drawn from algebraic, geometric, and trigonometric settings, each suggesting a different computational and conceptual interpretation. Pearson’s r (or a simple function of r) can be viewed as a special type of mean, a special type of variance, the ratio of two means, the ratio of two variances, the slope of a line, the cosine of an angle, and the tangent to an ellipse, among other perspectives.

Abstract

In 1885, Sir Francis Galton first defined the term "regression" and completed the theory of bivariate correlation. A decade later, Karl Pearson developed the index that we still use to measure correlation, Pearson's r. Our article is written in recognition of the 100th anniversary of Galton's first discussion of regression and correlation. We begin with a brief history. Then we present 13 different formulas, each of which represents a different computational and conceptual definition of r. Each formula suggests a different way of thinking about this index, from algebraic, geometric, and trigonometric settings. We show that Pearson's r (or simple functions of r) may variously be thought of as a special type of mean, a special type of variance, the ratio of two means, the ratio of two variances, the slope of a line, the cosine of an angle, and the tangent to an ellipse, and may be looked at from several other interesting perspectives.
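
As an illustration of how several of these interpretations coincide numerically, the following is a minimal Python sketch, not code from the article. It computes r with the usual product-moment formula and then again as a mean of z-score products, as the slope of the least-squares line through standardized scores, and as the cosine of the angle between the mean-centered data vectors. The function names and the sample data are hypothetical, chosen only for this example, and the z-scores are assumed to use the population standard deviation (divisor n).

```python
import math


def mean(values):
    return sum(values) / len(values)


def z_scores(values):
    """Standardize using the population SD (divisor n); an assumption of this sketch."""
    m = mean(values)
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / len(values))
    return [(v - m) / sd for v in values]


def pearson_r(x, y):
    """Product-moment form: sum of cross-products over the root of the sums of squares."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r_as_mean(x, y):
    """r as a special type of mean: the average product of paired z-scores."""
    return mean([zx * zy for zx, zy in zip(z_scores(x), z_scores(y))])


def r_as_slope(x, y):
    """r as the slope of a line: least-squares slope of z_y regressed on z_x."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / sum(a * a for a in zx)


def r_as_cosine(x, y):
    """r as the cosine of the angle between the mean-centered data vectors."""
    mx, my = mean(x), mean(y)
    cx = [a - mx for a in x]
    cy = [b - my for b in y]
    dot = sum(a * b for a, b in zip(cx, cy))
    norm_x = math.sqrt(sum(a * a for a in cx))
    norm_y = math.sqrt(sum(b * b for b in cy))
    return dot / (norm_x * norm_y)


if __name__ == "__main__":
    # Made-up illustrative data, not taken from the article.
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [2.0, 4.0, 5.0, 4.0, 5.0]
    for f in (pearson_r, r_as_mean, r_as_slope, r_as_cosine):
        print(f.__name__, round(f(x, y), 6))
```

All four functions should agree up to floating-point rounding. The slope version matches r only because both variables are standardized first; on raw scores, the regression slope is r multiplied by the ratio of the two standard deviations.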
