Iterative Learning for Reliable Crowdsourcing Systems

TLDR

Crowdsourcing systems distribute tasks to many low-paid workers. Because those workers can be unreliable, each task is typically assigned several times and the answers are combined, for example by majority voting. This paper formulates a general model of crowdsourcing tasks and asks how to minimize the total number of task assignments needed to reach a target overall reliability. The authors propose a new algorithm that decides which tasks to assign to which workers and infers the correct answers from the workers' responses. They show that the algorithm significantly outperforms majority voting and is asymptotically optimal when compared with an oracle that knows every worker's reliability.

Abstract

Crowdsourcing systems, in which tasks are electronically distributed to numerous information piece-workers, have emerged as an effective paradigm for human-powered solving of large-scale problems in domains such as image classification, data entry, optical character recognition, recommendation, and proofreading. Because these low-paid workers can be unreliable, nearly all crowdsourcers must devise schemes to increase confidence in their answers, typically by assigning each task multiple times and combining the answers in some way, such as majority voting. In this paper, we consider a general model of such crowdsourcing tasks, and pose the problem of minimizing the total price (i.e., number of task assignments) that must be paid to achieve a target overall reliability. We give a new algorithm for deciding which tasks to assign to which workers and for inferring correct answers from the workers' answers. We show that our algorithm significantly outperforms majority voting and, in fact, is asymptotically optimal through comparison to an oracle that knows the reliability of every worker.
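
To make the baseline concrete, the following is a minimal Python sketch of the majority-voting scheme the abstract compares against. It is not the paper's algorithm. The binary ±1 answers, the single shared `reliability` parameter (the probability that a worker answers correctly), and the function names `majority_vote` and `simulate` are all assumptions made for illustration.

```python
import random
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer in a list of worker answers."""
    return Counter(answers).most_common(1)[0][0]


def simulate(num_tasks, repeats, reliability, seed=0):
    """Estimate majority-vote accuracy under a toy model (an assumption).

    Each task has a hidden true label in {-1, +1}. Each of the `repeats`
    assignments returns the true label with probability `reliability`
    and the flipped label otherwise. The total price in the abstract's
    sense is num_tasks * repeats assignments.
    """
    rng = random.Random(seed)
    correct = 0
    for _ in range(num_tasks):
        truth = rng.choice([-1, 1])
        answers = [
            truth if rng.random() < reliability else -truth
            for _ in range(repeats)
        ]
        if majority_vote(answers) == truth:
            correct += 1
    return correct / num_tasks


if __name__ == "__main__":
    # Odd repeat counts avoid ties between the two labels.
    for repeats in (1, 3, 5, 9):
        accuracy = simulate(num_tasks=2000, repeats=repeats, reliability=0.7)
        print(f"repeats={repeats} price={2000 * repeats} accuracy={accuracy:.3f}")
```

Under this toy model, raising `repeats` trades a higher total price for higher accuracy, which is the trade-off the abstract proposes to optimize.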
