Supervised classification with temporal data

Abstract

In supervised classification there is a categorical dependent variable y and a set of independent variables (the predictors), which may be represented by a vector X. Given training data in the form of N observations $(X_1, y_1), \ldots, (X_N, y_N)$, the objective is to build a model that accurately predicts the values of y for unseen values of X. This dissertation examines a special case of the supervised classification problem. In the problem considered, predictor measurements have a natural ordering; in particular, an ordering due to a dependence on a special variable such as time. An observation consists of a finite series of measurements rather than an unordered set. The dissertation's thesis is that the information encapsulated in the ordering of predictors can be used to improve both the accuracy and understandability of classification models. Traditional induction algorithms are not designed to take advantage of this information. This dissertation explores ways that enable classifiers to do so. The ordering of predictors enables the characterization of observations over intervals of time, in terms of trends, effectively reducing the dimensionality of the input space. Questions relating to the representation, extraction, and evaluation of such features are addressed. A method is proposed to compress, segment, and smooth 2D scatterplots, using minimum description length estimation. The concept of a line episode is introduced to capture local trend characteristics. Line episodes can be extracted and used effectively by any classifier using trend-episode analysis, an algorithm presented in the thesis. Trends can be described at various scales, with varying levels of detail. The role of scale in the induction of classification models is explored, and a local scaling algorithm for line episodes is developed.
Trend-episode analysis is extended to enable any classifier to identify appropriate features at appropriate scales and induce classification models for instances with multiscale representations. The proposed framework is evaluated on synthetic data, under a wide range of carefully controlled experimental conditions, and on a NASA telemetry monitoring application. The results show that multiscale trend-episode analysis can increase both the separability of classes (leading to more accurate models) and the understandability of models.
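
To make the idea of trend features concrete, the following is a minimal illustrative sketch, not the trend-episode analysis algorithm of the dissertation. It assumes equally spaced measurements, splits a series into fixed, equal-width windows (the dissertation instead segments using minimum description length estimation), and summarizes each window by a least-squares slope and a mean level. The function names `fit_line`, `trend_features`, and `multiscale_features` are hypothetical and chosen only for this example.

```python
from typing import List, Sequence, Tuple


def fit_line(ts: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit y = slope * t + intercept."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in ts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return slope, mean_y - slope * mean_t


def trend_features(series: Sequence[float], n_segments: int) -> List[float]:
    """Describe a series by (slope, mean level) per window.

    Assumption: fixed equal-width windows stand in for a real
    segmentation step; this is a simplification for illustration.
    """
    n = len(series)
    if n_segments < 1 or n < 2 * n_segments:
        raise ValueError("need at least two points per segment")
    features: List[float] = []
    for k in range(n_segments):
        start = k * n // n_segments
        end = (k + 1) * n // n_segments
        ts = list(range(start, end))
        ys = list(series[start:end])
        slope, _ = fit_line(ts, ys)
        features.extend([slope, sum(ys) / len(ys)])
    return features


def multiscale_features(series: Sequence[float], scales: Sequence[int]) -> List[float]:
    """Concatenate trend features computed at several scales (coarse to fine)."""
    features: List[float] = []
    for n_segments in scales:
        features.extend(trend_features(series, n_segments))
    return features


if __name__ == "__main__":
    # A series that rises and then falls: 8 raw measurements.
    series = [0, 1, 2, 3, 3, 2, 1, 0]
    print(trend_features(series, 1))  # one window: the overall slope is zero
    print(trend_features(series, 2))  # two windows: a rise, then a fall
    print(multiscale_features(series, [1, 2]))
```

In this toy series the single coarse window has zero slope, while two finer windows expose a rising and a falling trend, which hints at why the choice of scale matters. The resulting feature vector could be passed to any standard classifier in place of the raw measurements.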