Publication | Closed Access
An Evaluation of Two-Step Techniques for Positive-Unlabeled Learning in Text Classification
Citations: 33 | References: 7 | Year: 2014
Artificial Intelligence, Engineering, Machine Learning, PU Learning Problem, Corpus Linguistics, Text Mining, Natural Language Processing, Classification Method, Information Retrieval, Data Science, Data Mining, Pattern Recognition, Computational Linguistics, Reliable Negative Documents, Document Classification, Text Classification, Language Studies, Semi-supervised Learning, Supervised Learning, Positive-unlabeled Learning, Automatic Classification, Two-step Techniques, Knowledge Discovery, Intelligent Classification, Computer Science, Two Steps, Linguistics
Positive-unlabeled (PU) learning is a semi-supervised learning problem in which the aim is to build an accurate binary classifier without collecting negative examples for training. The two-step approach is one solution to the PU learning problem and consists of two steps: (1) identifying a set of reliable negative documents, and (2) building a classifier iteratively. In this paper we evaluate five combinations of techniques for the two-step strategy. We found that using the Rocchio method in step 1 and the Expectation-Maximization method in step 2 was the most effective combination in our experiments.
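The two-step strategy described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the plain term-frequency weighting, the Rocchio weights `alpha=16` and `beta=4`, the multinomial naive Bayes model inside EM, and the fixed iteration count are all assumptions made for illustration. Step 1 treats the unlabeled set as negative, builds two Rocchio prototypes, and keeps the unlabeled documents that lie closer to the negative prototype. Step 2 runs EM, keeping the positive documents fixed and letting the labels of all unlabeled documents be re-estimated.

```python
import math
from collections import Counter, defaultdict

def tf_vector(doc):
    """L2-normalised term-frequency vector (assumption: no IDF, for brevity)."""
    counts = Counter(doc.split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {w: c / norm for w, c in counts.items()}

def centroid(vectors):
    total = defaultdict(float)
    for v in vectors:
        for w, x in v.items():
            total[w] += x
    return {w: x / len(vectors) for w, x in total.items()}

def cosine(a, b):
    dot = sum(x * b.get(w, 0.0) for w, x in a.items())
    na = math.sqrt(sum(x * x for x in a.values()))
    nb = math.sqrt(sum(x * x for x in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rocchio_reliable_negatives(pos, unl, alpha=16.0, beta=4.0):
    """Step 1: indices of unlabeled docs closer to the negative prototype."""
    U = [tf_vector(d) for d in unl]
    cp = centroid([tf_vector(d) for d in pos])
    cu = centroid(U)  # the unlabeled set stands in for the negative class
    vocab = set(cp) | set(cu)
    proto_pos = {w: alpha * cp.get(w, 0.0) - beta * cu.get(w, 0.0) for w in vocab}
    proto_neg = {w: alpha * cu.get(w, 0.0) - beta * cp.get(w, 0.0) for w in vocab}
    return [i for i, u in enumerate(U)
            if cosine(u, proto_neg) > cosine(u, proto_pos)]

def em_naive_bayes(pos, unl, rn_idx, iters=10):
    """Step 2: EM with naive Bayes; returns P(positive | doc) for each unlabeled doc."""
    docs = [Counter(d.split()) for d in pos + unl]
    vocab_size = len({w for d in docs for w in d})
    n_pos = len(pos)
    # resp[i] is the soft positive label; None means "not used in the first M-step"
    resp = [1.0] * n_pos + [None] * len(unl)
    for i in rn_idx:
        resp[n_pos + i] = 0.0
    for _ in range(iters):
        # M-step: Laplace-smoothed parameters from the current soft labels
        used = [(d, r) for d, r in zip(docs, resp) if r is not None]
        prior_pos = (1.0 + sum(r for _, r in used)) / (2.0 + len(used))
        wc = {1: Counter(), 0: Counter()}
        for d, r in used:
            for w, c in d.items():
                wc[1][w] += r * c
                wc[0][w] += (1.0 - r) * c
        tot = {k: sum(wc[k].values()) + vocab_size for k in (0, 1)}

        def loglik(d, k):
            return sum(c * math.log((wc[k][w] + 1.0) / tot[k]) for w, c in d.items())

        # E-step: re-estimate every unlabeled doc; positive docs stay at 1.0
        for j in range(n_pos, len(docs)):
            lp = math.log(prior_pos) + loglik(docs[j], 1)
            ln = math.log(1.0 - prior_pos) + loglik(docs[j], 0)
            m = max(lp, ln)
            ep, en = math.exp(lp - m), math.exp(ln - m)
            resp[j] = ep / (ep + en)
    return resp[n_pos:]

if __name__ == "__main__":
    # Toy data, invented for this sketch
    pos = ["goal match team win", "team score goal league", "match league win team"]
    unl = ["team goal win match", "recipe oven bake flour", "bake sugar flour recipe",
           "league score team goal", "oven sugar bake cake"]
    rn = rocchio_reliable_negatives(pos, unl)
    print(rn, em_naive_bayes(pos, unl, rn))
```

Documents whose returned probability exceeds 0.5 would be labeled positive. A real system would normally add TF-IDF weighting and a convergence check in place of the fixed iteration count.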