Concepedia

Abstract

Predictive models are susceptible to errors called unknown unknowns, in which the model assigns incorrect labels to instances with high confidence. These errors commonly arise when the training data does not represent variations of a class encountered at model deployment. Prior work showed that crowd workers can identify instances of unknown unknowns, but asking the crowd to identify a sufficient number of individual instances can be costly [2]. Instead, this paper presents an approach that leverages people’s ability to find patterns to retrain classifiers more effectively with fewer examples. We ask crowd workers to suggest and verify patterns in unknown unknowns. We then use these patterns to train an expansion classifier that identifies additional examples in existing data that the primary classifier has encountered (and potentially misclassified) in the past. Our experiments show that our approach outperforms existing unknown-unknown detection methods at improving classifier performance. This work is the first to leverage crowds to identify error patterns in large datasets to improve ML training.
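To make the two central ideas concrete, the following is a minimal sketch, not the paper's implementation. It assumes each past prediction carries a confidence score and a verified label, and it substitutes a simple predicate for the trained expansion classifier. The names `Prediction`, `find_unknown_unknowns`, `expand_with_pattern`, and the 0.9 threshold are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Prediction:
    text: str          # raw instance the primary classifier saw
    predicted: str     # label the primary classifier assigned
    confidence: float  # classifier confidence in [0, 1]
    true_label: str    # verified label (assumed available, e.g. from crowd workers)


def find_unknown_unknowns(preds: List[Prediction],
                          threshold: float = 0.9) -> List[Prediction]:
    """Keep predictions that are wrong despite confidence >= threshold."""
    return [p for p in preds
            if p.predicted != p.true_label and p.confidence >= threshold]


def expand_with_pattern(pool: List[str],
                        pattern: Callable[[str], bool]) -> List[str]:
    """Select pool items matching a crowd-suggested pattern.

    Assumption: a plain predicate stands in for the expansion classifier
    that the abstract describes training from crowd patterns.
    """
    return [text for text in pool if pattern(text)]
```

In this sketch, a pattern such as `lambda text: "white" in text` would pull every matching instance out of previously seen data, so a handful of crowd-verified patterns can yield many candidate retraining examples without labeling each one individually.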
