Publication | Closed Access
Equalization Loss for Long-Tailed Object Recognition
Citations: 465
References: 39
Year: 2020
Unknown Venue
Convolutional Neural Network, Engineering, Machine Learning, Image Classification, Image Analysis, Data Science, Pattern Recognition, Long-tail Learning, Equalization Loss, Video Transformer, Effective Equalization Loss, Machine Vision, Feature Learning, Object Detection, Computer Science, Deep Learning, Computer Vision, Object Recognition, Convolutional Neural Networks
Convolutional neural networks have excelled at object recognition, yet state-of-the-art detectors still struggle on large-vocabulary, long-tailed datasets such as LVIS. The study observes that each positive sample of one category acts as a negative sample for every other category, so tail categories accumulate more discouraging gradients. To counter this, it introduces an equalization loss, a lightweight modification that ignores those discouraging gradients for rare categories so that their learning is not put at a disadvantage during parameter updates. Compared with a Mask R-CNN baseline on LVIS, the method yields AP gains of 4.1% and 4.8% on rare and common categories, respectively, and it secured first place in the LVIS Challenge 2019. Code is available at https://github.com/tztztztztz/eql.detectron2.
Object recognition techniques using convolutional neural networks (CNNs) have achieved great success. However, state-of-the-art object detection methods still perform poorly on large-vocabulary, long-tailed datasets, e.g., LVIS. In this work, we analyze this problem from a novel perspective: each positive sample of one category can be seen as a negative sample for other categories, so the tail categories receive more discouraging gradients. Based on this observation, we propose a simple but effective loss, named equalization loss, which tackles the problem of long-tailed rare categories by simply ignoring those gradients for rare categories. The equalization loss protects the learning of rare categories from being at a disadvantage during network parameter updates. The model is thus able to learn more discriminative features for objects of rare classes. Without any bells and whistles, our method achieves AP gains of 4.1% and 4.8% for the rare and common categories, respectively, on the challenging LVIS benchmark, compared to the Mask R-CNN baseline. Using the equalization loss, we won 1st place in the LVIS Challenge 2019. Code is available at: https://github.com/tztztztztz/eql.detectron2.
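
The idea of "ignoring those gradients for rare categories" can be illustrated with a small sketch. The code below is not taken from the linked repository. It assumes a per-class sigmoid cross-entropy, a rarity test based on a frequency threshold, and that only foreground samples are masked. The names `equalization_weights`, `equalization_loss`, `loss_gradient`, `class_freqs`, and `freq_threshold` are hypothetical and chosen for illustration only.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def equalization_weights(label, class_freqs, freq_threshold, is_foreground=True):
    """Per-class weights for one sample (illustrative assumption, not the authors' code).

    A weight is 0 when the sample is a foreground object, the class is rare
    (frequency below the threshold), and the class is not the sample's own
    label. In that case the sample would only push the rare class down.
    """
    weights = []
    for j, freq in enumerate(class_freqs):
        is_rare = freq < freq_threshold
        is_negative = j != label
        ignore = is_foreground and is_rare and is_negative
        weights.append(0.0 if ignore else 1.0)
    return weights


def equalization_loss(logits, label, class_freqs, freq_threshold, is_foreground=True):
    """Weighted sigmoid cross-entropy summed over classes."""
    weights = equalization_weights(label, class_freqs, freq_threshold, is_foreground)
    loss = 0.0
    for j, (z, w) in enumerate(zip(logits, weights)):
        p = sigmoid(z)
        y = 1.0 if j == label else 0.0
        loss -= w * (y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return loss


def loss_gradient(logits, label, class_freqs, freq_threshold, is_foreground=True):
    """Gradient of the loss with respect to each logit: w_j * (p_j - y_j)."""
    weights = equalization_weights(label, class_freqs, freq_threshold, is_foreground)
    return [
        w * (sigmoid(z) - (1.0 if j == label else 0.0))
        for j, (z, w) in enumerate(zip(logits, weights))
    ]


# Toy setup: class 2 is rare, and the sample is a positive for frequent class 0.
class_freqs = [0.5, 0.3, 0.001]
logits = [2.0, -1.0, 0.5]
grads = loss_gradient(logits, label=0, class_freqs=class_freqs, freq_threshold=0.01)
print(grads)  # the entry for the rare class 2 is exactly 0.0
```

In this sketch a positive sample of a frequent class contributes no gradient to the rare class, while positives of the rare class itself and background samples are left untouched. Whether background samples should also be masked is a design choice that the abstract does not settle; the `is_foreground` flag here is an assumption.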