Publication | Closed Access
Best of both worlds: Human-machine collaboration for object annotation
Citations: 209
References: 57
Year: 2015
Venue: Unknown
Artificial Intelligence, Data Annotation, Scene Analysis, Engineering, Machine Learning, Object Annotations, Natural Language Processing, Image Analysis, Data Science, Pattern Recognition, Robot Learning, Machine Vision, Object Detection, Computer Science, Deep Learning, Computer Vision, Annotation Tool, Object Recognition, Scene Understanding, Human-computer Interaction, Object Annotation, Annotation, Scene Modeling, Automatic Annotation
Localizing every object in an image remains elusive, manual annotation is costly, and current detectors can reliably detect only a few objects per image. This work proposes a principled framework that combines state‑of‑the‑art object detection with crowd‑engineering techniques to accurately and efficiently localize objects. The system takes an image and desired precision, utility, or human‑cost constraints, and outputs annotations generated by a Markov Decision Process that seamlessly integrates multiple computer‑vision models with diverse human inputs. Experiments on the ILSVRC2014 dataset demonstrate the effectiveness of this human‑in‑the‑loop labeling approach.
The long-standing goal of localizing every object in an image remains elusive. Manually annotating objects is quite expensive despite crowd engineering innovations. Current state-of-the-art automatic object detectors can accurately detect at most a few objects per image. This paper brings together the latest advancements in object detection and in crowd engineering into a principled framework for accurately and efficiently localizing objects in images. The input to the system is an image to annotate and a set of annotation constraints: desired precision, utility and/or human cost of the labeling. The output is a set of object annotations, informed by human feedback and computer vision. Our model seamlessly integrates multiple computer vision models with multiple sources of human input in a Markov Decision Process. We empirically validate the effectiveness of our human-in-the-loop labeling approach on the ILSVRC2014 object detection dataset.
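The input/output contract described in the abstract (candidate detections in, a constrained set of annotations out) can be illustrated with a toy sketch. This is not the paper's Markov Decision Process: it swaps in a much simpler greedy loop that spends a fixed budget of yes/no verification questions on the most uncertain candidate. The names `Candidate`, `bayes_update`, `annotate`, and `ask_human`, as well as the fixed `worker_accuracy`, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Candidate:
    """A candidate object annotation (hypothetical type, not from the paper)."""
    label: str
    prob: float  # belief that the annotation is correct; starts as a detector score


def bayes_update(prior: float, answer_yes: bool, worker_accuracy: float) -> float:
    """Posterior belief after one yes/no answer from a worker of known accuracy."""
    like_true = worker_accuracy if answer_yes else 1.0 - worker_accuracy
    like_false = 1.0 - worker_accuracy if answer_yes else worker_accuracy
    numerator = like_true * prior
    return numerator / (numerator + like_false * (1.0 - prior))


def annotate(
    candidates: List[Candidate],
    ask_human: Callable[[Candidate], bool],
    min_precision: float,
    budget: float,
    cost_per_question: float = 1.0,
    worker_accuracy: float = 0.9,  # assumed constant; real workers vary
) -> Tuple[List[Candidate], float]:
    """Greedy stand-in for the paper's MDP: verify uncertain candidates under a budget.

    Returns the candidates whose belief reaches min_precision, plus the cost spent.
    If every returned candidate has belief >= min_precision, the expected
    precision of the returned set is also >= min_precision.
    """
    spent = 0.0
    while spent + cost_per_question <= budget:
        # A candidate is "uncertain" if it can be neither accepted nor rejected yet.
        uncertain = [c for c in candidates
                     if 1.0 - min_precision < c.prob < min_precision]
        if not uncertain:
            break
        # Ask about the candidate whose belief is closest to 0.5.
        target = min(uncertain, key=lambda c: abs(c.prob - 0.5))
        answer = ask_human(target)
        target.prob = bayes_update(target.prob, answer, worker_accuracy)
        spent += cost_per_question
    accepted = [c for c in candidates if c.prob >= min_precision]
    return accepted, spent
```

In this sketch the precision constraint is enforced per annotation and the human cost is a hard cap. A utility constraint, multiple question types, and a learned policy over which question to ask next, which the abstract attributes to the MDP, are left out.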