Learning a Metric for Code Readability

TLDR

The study investigates how code readability relates to software quality and what that implies for language design and engineering practice. Using annotations from 120 human annotators, the authors identify local code features that predict readability judgments and build an automated metric from them. The metric is 80 percent effective at predicting readability judgments and, on average, better than a human. It also correlates strongly with three measures of software quality (code changes, automated defect reports, and defect log messages), measured on over 2.2 million lines of code and longitudinally over many releases of selected projects. The data further suggest that blank lines matter more than comments for local readability judgments.

Abstract

In this paper, we explore the concept of code readability and investigate its relation to software quality. With data collected from 120 human annotators, we derive associations between a simple set of local code features and human notions of readability. Using those features, we construct an automated readability measure and show that it can be 80 percent effective and better than a human, on average, at predicting readability judgments. Furthermore, we show that this metric correlates strongly with three measures of software quality: code changes, automated defect reports, and defect log messages. We measure these correlations on over 2.2 million lines of code, as well as longitudinally, over many releases of selected projects. Finally, we discuss the implications of this study on programming language design and engineering practice. For example, our data suggest that comments, in and of themselves, are less important than simple blank lines to local judgments of readability.
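To make "a simple set of local code features" concrete, the following is a minimal sketch of the kind of surface features such a measure could be built on. It is not the authors' feature set or model: the function name `local_features`, the four features chosen, and the `comment_prefix` parameter are all assumptions made for illustration.

```python
def local_features(snippet: str, comment_prefix: str = "//") -> dict:
    """Compute a few surface features of a code snippet.

    Illustrative only: these features and their names are assumptions,
    not the feature set used in the paper.
    """
    lines = snippet.splitlines()
    n = len(lines)
    if n == 0:
        # An empty snippet has no lines to measure.
        return {
            "avg_line_length": 0.0,
            "max_line_length": 0,
            "blank_line_ratio": 0.0,
            "comment_line_ratio": 0.0,
        }

    lengths = [len(line) for line in lines]
    # A line counts as blank if it contains only whitespace.
    blank = sum(1 for line in lines if not line.strip())
    # Only whole-line comments are counted; trailing comments are ignored.
    comments = sum(
        1 for line in lines if line.strip().startswith(comment_prefix)
    )

    return {
        "avg_line_length": sum(lengths) / n,
        "max_line_length": max(lengths),
        "blank_line_ratio": blank / n,
        "comment_line_ratio": comments / n,
    }


if __name__ == "__main__":
    example = "int x = 1;\n\n// add one\nx = x + 1;"
    print(local_features(example))
```

Feature vectors like this one could then be paired with human readability scores to train a classifier, which is the general shape of the approach the abstract describes. The choice of learner and the full feature list are not stated here and are left open.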
