Publication | Closed Access
Clustering time series from ARMA models with clipped data
Citations: 97
References: 42
Year: 2004
Unknown Venue
Cluster Computing, Clipped Series, Document Clustering, Engineering, Machine Learning, Data Science, Data Mining, Spatiotemporal Database, Data Stream Mining, Knowledge Discovery, Binary Sequences, Temporal Pattern Recognition, Computer Science, Statistics, Unsupervised Machine Learning, Nonlinear Time Series, Data Modeling
Clustering time series is a problem that has applications in a wide variety of fields and has recently attracted a large amount of research. In this paper we focus on clustering data derived from Autoregressive Moving Average (ARMA) models using the k-means and k-medoids algorithms with the Euclidean distance between estimated model parameters. We justify our choice of clustering technique and distance metric by reproducing results obtained in related research. Our research aim is to assess the effects on the clustering of time series of discretising data into binary sequences that record whether each value lies above or below the median, a process known as clipping. It is known that the fitted AR parameters of clipped data tend asymptotically to the parameters for unclipped data. We exploit this result to demonstrate that, for long series, the clustering accuracy when using clipped data from the class of ARMA models is not significantly different to that achieved with unclipped data. Next we show that if the data contains outliers, then using clipped data produces significantly better clusterings. We then demonstrate that using clipped series requires much less memory and that operations such as distance calculations can be much faster. Finally, we demonstrate these advantages on three real-world data sets.
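To make the clipping step concrete, the following is a minimal sketch in plain Python and not the authors' implementation. It assumes a simulated AR(1) series, uses the sample lag-1 autocorrelation as a one-parameter stand-in for the fitted model parameters, and packs the clipped bits into a single integer to illustrate the memory saving. All function names (`clip`, `lag1_autocorr`, `simulate_ar1`, `euclidean`, `pack_bits`) are hypothetical.

```python
import random
from statistics import median


def clip(series):
    """Clip a series: 1 where the value is above the series median, else 0."""
    m = median(series)
    return [1 if x > m else 0 for x in series]


def lag1_autocorr(series):
    """Sample lag-1 autocorrelation, used here as a simple AR(1) estimate."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    den = sum((x - mean) ** 2 for x in series)
    return num / den


def simulate_ar1(phi, n, rng):
    """Simulate x_t = phi * x_{t-1} + e_t with standard normal noise e_t."""
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def euclidean(a, b):
    """Euclidean distance between two equal-length parameter vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5


def pack_bits(bits):
    """Pack a 0/1 sequence into one integer, one bit per observation."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value


# Assumed example data: two AR(1) series with different parameters.
rng = random.Random(0)
series_a = simulate_ar1(0.7, 2000, rng)
series_b = simulate_ar1(0.2, 2000, rng)
clipped_a, clipped_b = clip(series_a), clip(series_b)

# One-dimensional feature vectors for the raw and the clipped series.
raw_distance = euclidean([lag1_autocorr(series_a)], [lag1_autocorr(series_b)])
clipped_distance = euclidean([lag1_autocorr(clipped_a)], [lag1_autocorr(clipped_b)])

# The clipped series fits in about len(series) bits instead of one float each.
packed_a = pack_bits(clipped_a)
print(raw_distance, clipped_distance, packed_a.bit_length())
```

In a clustering experiment, feature vectors like these would be passed to k-means or k-medoids with `euclidean` as the distance; the sketch stops short of that step and makes no claim about clustering accuracy.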