Time series shapelets: a novel technique that allows accurate, interpretable and fast classification

TLDR

Time-series classification has attracted great interest, yet the nearest neighbor algorithm, though hard to beat on accuracy, must store and search the entire dataset and offers little interpretability, which limits its use on resource-constrained sensors. This work introduces time series shapelets, a new primitive designed to overcome these limitations. Shapelets are subsequences that are maximally representative of a class, and classification uses the distance to a shapelet instead of the distance to the nearest neighbor. Extensive empirical evaluations across diverse domains show that shapelet-based classifiers can be interpretable, more accurate, and significantly faster than state-of-the-art classifiers.

Abstract

Classification of time series has attracted great interest over the past decade. While dozens of techniques have been introduced, recent empirical evidence strongly suggests that the simple nearest neighbor algorithm is very difficult to beat for most time series problems, especially for large-scale datasets. This may be considered good news, given how simple the nearest neighbor algorithm is to implement, but it has two negative consequences. First, the nearest neighbor algorithm requires storing and searching the entire dataset, resulting in high time and space complexity that limits its applicability, especially on resource-limited sensors. Second, beyond mere classification accuracy, we often wish to gain some insight into the data and to make the classification result more explainable, something that the global, whole-series comparisons of the nearest neighbor algorithm cannot provide. In this work we introduce a new time series primitive, time series shapelets, which addresses these limitations. Informally, shapelets are time series subsequences which are in some sense maximally representative of a class. To classify objects, we can use the distance to the shapelet rather than the distance to the nearest neighbor. As we shall show with extensive empirical evaluations in diverse domains, classification algorithms based on the time series shapelet primitive can be interpretable, more accurate, and significantly faster than state-of-the-art classifiers.
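To make the idea of classifying by distance to a shapelet concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that the distance from a series to a shapelet is the smallest Euclidean distance between the shapelet and any window of the same length in the series, and that a single distance threshold separates two classes. The function names, the labels "A" and "B", the threshold value, and the toy data are all hypothetical and chosen only for illustration.

```python
import math


def subsequence_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any
    same-length window of the series (assumed distance definition)."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = math.inf
    # Slide a window of length m across the series and keep the best match.
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        dist = math.sqrt(sum((w - s) ** 2 for w, s in zip(window, shapelet)))
        best = min(best, dist)
    return best


def classify(series, shapelet, threshold, near_label="A", far_label="B"):
    """Assign near_label when the series contains a close match to the
    shapelet, far_label otherwise (hypothetical two-class rule)."""
    if subsequence_distance(series, shapelet) <= threshold:
        return near_label
    return far_label


# Toy data: the shapelet is a single bump; one series contains it, one does not.
shapelet = [0.0, 1.0, 0.0]
with_bump = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
flat = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

print(classify(with_bump, shapelet, threshold=0.5))  # series containing the bump
print(classify(flat, shapelet, threshold=0.5))       # series without the bump
```

In this sketch only the shapelet and the threshold need to be kept at classification time, not the training set, and the matched window shows which part of the series drove the decision. This is the storage and interpretability contrast with nearest neighbor that the abstract describes.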
