Semi-Supervised Consensus Labeling for Crowdsourcing

Abstract

Because individual crowd workers often exhibit high variance in annotation accuracy, we typically ask multiple workers to label each example and infer a single consensus label from their responses. While simple majority vote computes consensus by equally weighting each worker’s vote, weighted voting assigns greater weight to more accurate workers, where accuracy is estimated by inter-annotator agreement (unsupervised) and/or agreement with known expert labels (supervised). In this paper, we investigate the trade-off between annotation cost and consensus accuracy as the amount of expert supervision increases. To maximize the benefit from supervision, we propose a semi-supervised approach that infers consensus labels using both labeled and unlabeled examples. We compare our semi-supervised approach with several existing unsupervised and supervised baselines, evaluating on both synthetic data and Amazon Mechanical Turk data. Results show that (a) a very modest amount of supervision can provide significant benefit, and (b) the consensus accuracy achieved by full supervision with a large amount of labeled data is matched by our semi-supervised approach with much less supervision.
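
To make the contrast between majority vote and supervised weighted voting concrete, the following is a minimal Python sketch. It is not the semi-supervised method proposed in this paper. It only illustrates weighting each vote by a worker's estimated accuracy on expert-labeled examples. The function names, the toy data, and the smoothing toward 0.5 are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def majority_vote(votes):
    """Return the most frequent label among (worker, label) pairs."""
    counts = Counter(label for _, label in votes)
    return counts.most_common(1)[0][0]


def estimate_accuracy(worker_labels, gold, prior=0.5, strength=2.0):
    """Estimate each worker's accuracy from agreement with expert labels.

    worker_labels: {worker: {example: label}}
    gold: {example: expert label}
    The estimate is smoothed toward `prior` (an illustrative assumption)
    so that workers with few scored examples are not over-trusted.
    """
    accuracy = {}
    for worker, labels in worker_labels.items():
        scored = [ex for ex in labels if ex in gold]
        correct = sum(labels[ex] == gold[ex] for ex in scored)
        accuracy[worker] = (correct + prior * strength) / (len(scored) + strength)
    return accuracy


def weighted_vote(votes, accuracy, default=0.5):
    """Return the label with the largest total accuracy-weighted vote."""
    totals = defaultdict(float)
    for worker, label in votes:
        totals[label] += accuracy.get(worker, default)
    return max(totals, key=totals.get)


# Hypothetical toy data: three workers, two expert-labeled examples (e1, e2),
# and one unlabeled example (e3) whose consensus label we want.
worker_labels = {
    "a": {"e1": 1, "e2": 0, "e3": 1},
    "b": {"e1": 0, "e2": 1, "e3": 0},
    "c": {"e1": 0, "e2": 1, "e3": 0},
}
gold = {"e1": 1, "e2": 0}

accuracy = estimate_accuracy(worker_labels, gold)
votes_e3 = [(worker, labels["e3"]) for worker, labels in worker_labels.items()]

print(majority_vote(votes_e3))            # two of three workers vote 0
print(weighted_vote(votes_e3, accuracy))  # worker "a" outweighs "b" and "c"
```

In this toy case the two workers who disagree with the expert labels outvote the accurate worker under majority vote, while the accuracy-weighted vote follows the accurate worker.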
