Quantification in Data Streams: Initial Results
Year: 2017 | Citations: 22 | References: 14
Engineering, Machine Learning, Streaming Algorithm, Text Mining, Classification Method, Data Stream, Information Retrieval, Data Science, Data Mining, Pattern Recognition, Class Imbalance, Management, Data Integration, Data Management, Statistics, Quantification Goal, Predictive Analytics, Knowledge Discovery, Intelligent Classification, Learning Analytics, Computer Science, Data Stream Management, Data Stream Mining, Classification, Data Streams
In recent decades, learning from data streams has attracted the attention of researchers and practitioners due to its large number of applications. These applications have motivated the research community to propose a significant number of methods for diverse tasks, most prominently classification, prediction, and clustering. However, a relevant task known as quantification has remained largely unexplored. The goal of quantification is to estimate the class prevalence in an unlabeled set. In this paper, we discuss the relevance and challenges of quantification for data streams and examine how it differs from the batch setting, in which quantification has attracted more attention from the research community. We propose an algorithm to estimate the class distribution in a data stream and frame our algorithm in the active learning framework. In addition, we define two other approaches as baseline and topline strategies for this problem. The experimental results demonstrate that our algorithm achieves significantly higher quantification accuracy than the baseline, and accuracy almost as high as that of the topline, while requiring only a fraction of the true labels requested by the latter approach.
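To make the quantification goal concrete, the following is a minimal sketch of two generic quantifiers from the batch literature, Classify and Count and Adjusted Count, applied over a sliding window of binary classifier outputs. It is not the algorithm proposed in this paper, which the abstract does not specify. The function names, the window size, and the assumption that the classifier's true positive rate (`tpr`) and false positive rate (`fpr`) are known and stay fixed are all illustrative.

```python
from collections import deque


def classify_and_count(predictions):
    """Estimate positive-class prevalence as the fraction predicted positive (1)."""
    return sum(predictions) / len(predictions)


def adjusted_count(predictions, tpr, fpr):
    """Correct the Classify and Count estimate using assumed classifier error rates.

    Solves cc = p * tpr + (1 - p) * fpr for the true prevalence p,
    then clips the result to the valid range [0, 1].
    """
    cc = classify_and_count(predictions)
    if tpr == fpr:
        # The correction is undefined; fall back to the raw count.
        return cc
    p = (cc - fpr) / (tpr - fpr)
    return min(1.0, max(0.0, p))


def windowed_prevalence(prediction_stream, window_size, tpr, fpr):
    """Yield one prevalence estimate per full sliding window of predictions.

    Hypothetical helper: tpr and fpr are assumed constant over the stream,
    which will not hold if the classifier degrades under concept drift.
    """
    window = deque(maxlen=window_size)
    for pred in prediction_stream:
        window.append(pred)
        if len(window) == window_size:
            yield adjusted_count(window, tpr, fpr)


# Usage: six binary predictions, windows of four, assumed tpr=0.8 and fpr=0.2.
estimates = list(windowed_prevalence([1, 0, 0, 0, 1, 1], 4, tpr=0.8, fpr=0.2))
```

In a real stream, the error rates would have to be re-estimated from labeled instances as the data changes, which is one reason label requests matter in this setting.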