Publication | Closed Access

On the application of GP to streaming data classification tasks with label budgets

Citations: 11

References: 27

Year: 2014

Abstract

A framework is introduced for applying genetic programming (GP) to streaming data classification tasks under label budgets. This is a fundamental requirement if GP is to adapt to the challenge of streaming data environments. The framework comprises three elements: a sampling policy, a data subset, and a data archiving policy. The sampling policy establishes on what basis data are sampled from the stream, and therefore when label information is requested. The data subset defines what GP individuals evolve against; it is a mixture of data forwarded under the sampling policy and historical data identified through the data archiving policy. Together, the sampling policy and the data subset decouple the rate at which the stream passes from the rate at which evolution proceeds. Benchmarking is performed on two artificial data sets with specific forms of sudden shift and gradual drift, as well as a well-known real-world data set.
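
The three elements named in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which policies the authors use, so everything here is an assumption made for illustration: the class name LabelBudgetSubset, uniform random sampling capped by the label budget, a first-in-first-out archive, and the parameter names budget, recent_size and archive_size. The sketch shows only how label requests, and therefore updates to the data subset, can be decoupled from the rate of the stream.

```python
import random
from collections import deque


class LabelBudgetSubset:
    """Hypothetical illustration of a sampling policy, data subset and archive."""

    def __init__(self, budget, recent_size, archive_size, seed=0):
        self.budget = budget                       # max fraction of stream that may be labelled
        self.recent = deque(maxlen=recent_size)    # data forwarded by the sampling policy (recent_size >= 1)
        self.archive = deque(maxlen=archive_size)  # historical data kept by the archiving policy
        self.rng = random.Random(seed)
        self.seen = 0                              # exemplars that have passed in the stream
        self.labelled = 0                          # label requests made so far

    def should_sample(self):
        # Sampling policy (assumed): uniform random selection, with a hard cap
        # so that labelled / seen never exceeds the budget.
        within_budget = (self.labelled + 1) <= self.budget * self.seen
        return within_budget and self.rng.random() < self.budget

    def observe(self, x, oracle):
        """Process one exemplar; return True if the data subset changed."""
        self.seen += 1
        if not self.should_sample():
            return False
        y = oracle(x)                              # the only place a label is requested
        self.labelled += 1
        if len(self.recent) == self.recent.maxlen:
            # Archiving policy (assumed): the oldest recent exemplar moves to
            # the archive, which itself discards its oldest entry when full.
            self.archive.append(self.recent[0])
        self.recent.append((x, y))
        return True

    def subset(self):
        # Data subset: historical exemplars plus recently sampled ones.
        return list(self.archive) + list(self.recent)


if __name__ == "__main__":
    framework = LabelBudgetSubset(budget=0.1, recent_size=5, archive_size=5)
    updates = 0
    for value in range(1000):
        # A real system would run GP generations against framework.subset()
        # only when observe() reports a change.
        if framework.observe(value, oracle=lambda v: v % 2):
            updates += 1
    print(framework.seen, framework.labelled, updates, len(framework.subset()))
```

In this sketch evolution would be triggered only when observe returns True, so the number of training updates follows the label budget and not the length of the stream.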
