Concepedia

Publication | Closed Access

Best of both worlds: Human-machine collaboration for object annotation

Citations: 209

References: 57

Year: 2015

TLDR

Localizing every object in an image remains elusive, manual annotation is costly, and current detectors can reliably detect only a few objects per image. This work proposes a principled framework that combines state‑of‑the‑art object detection with crowd‑engineering techniques to accurately and efficiently localize objects. The system takes an image and desired precision, utility, or human‑cost constraints, and outputs annotations generated by a Markov Decision Process that seamlessly integrates multiple computer‑vision models with diverse human inputs. Experiments on the ILSVRC2014 dataset demonstrate the effectiveness of this human‑in‑the‑loop labeling approach.

Abstract

The long-standing goal of localizing every object in an image remains elusive. Manually annotating objects is quite expensive despite crowd engineering innovations. Current state-of-the-art automatic object detectors can accurately detect at most a few objects per image. This paper brings together the latest advancements in object detection and in crowd engineering into a principled framework for accurately and efficiently localizing objects in images. The input to the system is an image to annotate and a set of annotation constraints: desired precision, utility and/or human cost of the labeling. The output is a set of object annotations, informed by human feedback and computer vision. Our model seamlessly integrates multiple computer vision models with multiple sources of human input in a Markov Decision Process. We empirically validate the effectiveness of our human-in-the-loop labeling approach on the ILSVRC2014 object detection dataset.
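
The abstract describes a loop in which detector output and human answers are combined under a cost constraint. The sketch below is a toy illustration of that idea only, not the paper's Markov Decision Process: it uses a greedy one-step rule, a single type of human question (yes/no verification of a box), and noiseless answers. All names (`Candidate`, `expected_gain`, `annotate`, `ask_human`, `precision_target`) are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Candidate:
    label: str
    prob: float  # detector's estimated probability that this box is correct


def expected_gain(prob: float) -> float:
    """Uncertainty removed by a (assumed noiseless) yes/no verification."""
    return min(prob, 1.0 - prob)


def annotate(
    candidates: List[Candidate],
    ask_human: Callable[[Candidate], bool],
    budget: float,
    cost_per_question: float = 1.0,
    precision_target: float = 0.9,
) -> Tuple[List[Candidate], float]:
    """Greedy toy policy: spend the human budget on the most uncertain boxes.

    Assumption: this stands in for the paper's MDP policy, which is not
    specified in the abstract.
    """
    pool = [Candidate(c.label, c.prob) for c in candidates]  # work on a copy
    spent = 0.0
    while spent + cost_per_question <= budget:
        uncertain = [c for c in pool if 0.0 < c.prob < 1.0]
        if not uncertain:
            break  # every box is already resolved
        target = max(uncertain, key=lambda c: expected_gain(c.prob))
        # A human answer replaces the detector's estimate with certainty.
        target.prob = 1.0 if ask_human(target) else 0.0
        spent += cost_per_question
    # Output only boxes that meet the requested precision level.
    accepted = [c for c in pool if c.prob >= precision_target]
    return accepted, spent
```

With a budget of one question, this rule would ask about the box whose probability is closest to 0.5 and return every box that then meets `precision_target`. A fuller model would also weigh different question types and noisy answers, which the abstract mentions only at a high level.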
