Publication | Closed Access
Learning From Crowds
Citations: 1K | References: 19 | Year: 2010
Artificial Intelligence, Data Annotation, Engineering, Machine Learning, Automatic Annotation Tool, Text Mining, Natural Language Processing, Absolute Gold Standard, Data Science, Data Mining, Pattern Recognition, Robot Learning, Human Computation, Multiple Annotators, Semi-supervised Learning, Supervised Learning, Knowledge Discovery, Computer Science, Crowdsourcing, Crowd Computing, Automatic Annotation
Obtaining reliable labels for supervised learning is often infeasible, so practitioners rely on noisy, subjective annotations from multiple experts, whose disagreement makes learning challenging. This work proposes a probabilistic framework for supervised learning that operates without a gold standard by modeling multiple annotators’ noisy labels. The algorithm jointly estimates each annotator’s reliability and infers the underlying true labels from the noisy observations. Experiments show the method outperforms the standard majority‑vote baseline.
For many supervised learning tasks it may be infeasible (or very expensive) to obtain objective and reliable labels. Instead, we can collect subjective (possibly noisy) labels from multiple experts or annotators. In practice, there is a substantial amount of disagreement among the annotators, and hence it is of great practical interest to address conventional supervised learning problems in this scenario. In this paper we describe a probabilistic approach for supervised learning when we have multiple annotators providing (possibly noisy) labels but no absolute gold standard. The proposed algorithm evaluates the different experts and also gives an estimate of the actual hidden labels. Experimental results indicate that the proposed method is superior to the commonly used majority voting baseline.
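To make the idea of jointly scoring annotators and estimating hidden labels concrete, the following is a minimal sketch, not the paper's implementation. It assumes binary labels, every annotator labelling every item, and a simplified latent-class EM with no feature-based classifier. The function names (`em_annotators`, `simulate`, `accuracy`) and the simulated annotator accuracies are illustrative assumptions, not taken from the abstract.

```python
import random


def majority_vote(labels):
    """Fraction of annotators voting 1 for each item (labels[i][j] is 0 or 1)."""
    return [sum(row) / len(row) for row in labels]


def clamp(p, eps=1e-6):
    """Keep a probability strictly inside (0, 1) to avoid degenerate estimates."""
    return min(max(p, eps), 1.0 - eps)


def em_annotators(labels, n_iter=50):
    """Hypothetical helper: EM over hidden binary labels and per-annotator
    sensitivity/specificity. Returns (posteriors, sensitivity, specificity)."""
    n_items, n_annot = len(labels), len(labels[0])
    mu = majority_vote(labels)  # initialise P(true label = 1) from soft votes
    sens, spec = [0.5] * n_annot, [0.5] * n_annot
    for _ in range(n_iter):
        # M-step: re-estimate the class prior and each annotator's reliability,
        # weighting every item by the current posterior of its hidden label.
        pos = sum(mu)
        neg = n_items - pos
        prior = clamp(pos / n_items)
        for j in range(n_annot):
            hit = sum(mu[i] * labels[i][j] for i in range(n_items))
            rej = sum((1 - mu[i]) * (1 - labels[i][j]) for i in range(n_items))
            sens[j] = clamp(hit / max(pos, 1e-12))
            spec[j] = clamp(rej / max(neg, 1e-12))
        # E-step: posterior of each hidden label given all annotators' votes.
        new_mu = []
        for row in labels:
            p1, p0 = prior, 1.0 - prior
            for j, vote in enumerate(row):
                if vote == 1:
                    p1 *= sens[j]
                    p0 *= 1.0 - spec[j]
                else:
                    p1 *= 1.0 - sens[j]
                    p0 *= spec[j]
            new_mu.append(p1 / (p1 + p0))
        mu = new_mu
    return mu, sens, spec


def simulate(accuracies, n_items=1000, seed=0):
    """Synthetic data (an assumption for illustration): each annotator reports
    the true label with the given probability and flips it otherwise."""
    rng = random.Random(seed)
    truth = [rng.randint(0, 1) for _ in range(n_items)]
    labels = [[y if rng.random() < a else 1 - y for a in accuracies] for y in truth]
    return truth, labels


def accuracy(scores, truth):
    """Share of items whose thresholded score matches the true label."""
    return sum((s > 0.5) == bool(y) for s, y in zip(scores, truth)) / len(truth)


if __name__ == "__main__":
    # Two fairly reliable annotators and three near-random ones (made-up values).
    truth, labels = simulate([0.95, 0.95, 0.55, 0.55, 0.55])
    mu, sens, spec = em_annotators(labels)
    print("majority vote accuracy:", accuracy(majority_vote(labels), truth))
    print("EM accuracy:           ", accuracy(mu, truth))
    print("estimated sensitivity: ", [round(s, 2) for s in sens])
    print("estimated specificity: ", [round(s, 2) for s in spec])
```

In this sketch, majority voting weights every annotator equally, while the EM loop learns to trust annotators whose votes agree with the inferred labels. The estimated sensitivity and specificity play the role of the expert evaluation mentioned in the abstract, and the posteriors `mu` are the estimates of the hidden labels.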