The identification of stages in diachronic data: variability-based neighbour clustering

Abstract

In this paper, we introduce a data-driven bottom-up clustering method for the identification of stages in diachronic corpus data that differ from each other quantitatively. Much like regular approaches to hierarchical clustering, it is based on identifying and merging the most cohesive groups of data points, but, unlike regular approaches to clustering, it allows for the merging of temporally adjacent data, thus, in effect, preserving the chronological order. We exemplify the method with two case studies, one on verbal complementation of shall, the other on the development of the perfect in English.

References

Page 1

	Year	Citations

Page 1