Publication | Open Access
Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection
Citations: 763
References: 0
Year: 2020
One-stage detectors typically formulate object detection as dense classification and localization. The classification is usually optimized with Focal Loss, and the box location is commonly learned under a Dirac delta distribution. A recent trend for one-stage detectors is to introduce a separate prediction branch that estimates localization quality, where the predicted quality assists classification and improves detection performance. This paper delves into the representations of these three fundamental elements: quality estimation, classification, and localization. Two problems are identified in existing practice: (1) the inconsistent use of quality estimation and classification between training and inference, and (2) the inflexible Dirac delta distribution for localization when there is ambiguity and uncertainty in complex scenes. To address these problems, we design new representations for these elements. Specifically, we merge the quality estimation into the class prediction vector to form a joint representation of localization quality and classification, and use a vector to represent an arbitrary distribution of box locations. The improved representations eliminate the inconsistency risk and accurately depict the flexible distribution in real data, but they contain continuous labels, which are beyond the scope of Focal Loss. We then propose Generalized Focal Loss (GFL), which generalizes Focal Loss from its discrete form to a continuous version for successful optimization. On COCO test-dev, GFL achieves 45.0% AP with a ResNet-101 backbone, surpassing the state-of-the-art SAPD (43.5%) and ATSS (43.6%) at higher or comparable inference speed, under the same backbone and training settings. Notably, our best model achieves a single-model, single-scale AP of 48.2% at 10 FPS on a single 2080Ti GPU. Code and models are available at https://github.com/implus/GFocal.
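The abstract does not give the loss formulas, so the following is only a minimal sketch of the two ideas it describes: a focal-style loss on a continuous quality label, and a box offset represented as a discrete distribution over bins. It follows the quality and distribution terms as they are commonly stated for GFL. The function names, the default beta = 2, the integer bin layout, and the scalar (non-batched) interface are assumptions made for illustration and are not taken from the released code.

```python
import math


def quality_focal_loss(sigma, y, beta=2.0, eps=1e-12):
    """Focal-style loss for a continuous target y in [0, 1].

    sigma: predicted joint classification-quality score in (0, 1).
    y:     soft label, e.g. the IoU with the ground-truth box for the
           positive class and 0 for negatives (assumed convention).
    beta:  modulating exponent (assumed default).
    """
    sigma = min(max(sigma, eps), 1.0 - eps)  # keep the logs finite
    cross_entropy = -((1.0 - y) * math.log(1.0 - sigma) + y * math.log(sigma))
    # The modulating factor vanishes as the prediction approaches the label.
    return abs(y - sigma) ** beta * cross_entropy


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_total = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_total for z in logits]


def expected_offset(logits):
    """Expected box offset when bin i stands for the integer offset i."""
    return sum(math.exp(lp) * i for i, lp in enumerate(log_softmax(logits)))


def distribution_focal_loss(logits, y):
    """Cross-entropy on the two bins that bracket the continuous target y.

    logits: n + 1 unnormalized scores for integer bins 0..n (n >= 1).
    y:      continuous regression target in [0, n].
    """
    n = len(logits) - 1
    left = min(int(math.floor(y)), n - 1)  # nearest bin at or below y
    right = left + 1
    log_probs = log_softmax(logits)
    # Weights are linear interpolation coefficients, so the loss pulls
    # probability mass toward the two bins closest to y.
    return -((right - y) * log_probs[left] + (y - left) * log_probs[right])
```

With a hard label y = 1 and beta = 2, `quality_focal_loss` reduces to the familiar focal term (1 - sigma)^2 * (-log sigma), which is the sense in which the discrete form is a special case of the continuous one.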