Publication | Open Access
Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression
3.8K Citations · 21 References · Year: 2020
Keywords: Convolutional Neural Network, Engineering, Machine Learning, Box Regression, Image Analysis, Data Science, Pattern Recognition, Sparse Neural Network, Supervised Learning, Inaccurate Regression, Machine Vision, Computational Learning Theory, Feature Learning, Object Detection, Distance-IoU Loss, Computer Science, Statistical Learning Theory, Computer Vision, Object Recognition
Bounding box regression is essential for object detection, yet the conventional ℓn-norm loss does not align with the IoU evaluation metric, and the more recent IoU and GIoU losses still converge slowly and yield inaccurate predictions. This work introduces Distance-IoU (DIoU) and Complete-IoU (CIoU) losses that incorporate normalized distance and geometric factors to accelerate training and enhance detection accuracy. DIoU adds a normalized center-point distance term to the loss, while CIoU extends it with overlap-area and aspect-ratio components; both losses can be seamlessly integrated into existing detectors, and DIoU can also serve as the NMS criterion. Applying DIoU and CIoU to YOLO v3, SSD, and Faster R-CNN yields significant improvements under both the IoU and GIoU metrics, and the authors provide source code and pretrained models on GitHub.
Bounding box regression is the crucial step in object detection. In existing methods, while ℓn-norm loss is widely adopted for bounding box regression, it is not tailored to the evaluation metric, i.e., Intersection over Union (IoU). Recently, IoU loss and generalized IoU (GIoU) loss have been proposed to benefit the IoU metric, but still suffer from the problems of slow convergence and inaccurate regression. In this paper, we propose a Distance-IoU (DIoU) loss by incorporating the normalized distance between the predicted box and the target box, which converges much faster in training than IoU and GIoU losses. Furthermore, this paper summarizes three geometric factors in bounding box regression, i.e., overlap area, central point distance and aspect ratio, based on which a Complete IoU (CIoU) loss is proposed, thereby leading to faster convergence and better performance. By incorporating DIoU and CIoU losses into state-of-the-art object detection algorithms, e.g., YOLO v3, SSD and Faster R-CNN, we achieve notable performance gains in terms of not only IoU metric but also GIoU metric. Moreover, DIoU can be easily adopted into non-maximum suppression (NMS) to act as the criterion, further boosting performance improvement. The source code and trained models are available at https://github.com/Zzh-tju/DIoU.
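The abstract's three geometric factors map directly onto the loss terms: DIoU is `1 - IoU` plus the squared center distance normalized by the diagonal of the smallest enclosing box, and CIoU adds a weighted aspect-ratio consistency term. A minimal sketch of both losses for a single pair of boxes in `(x1, y1, x2, y2)` format (the function name and box convention are illustrative, not from the paper's codebase):

```python
import math

def diou_ciou_loss(pred, target, use_ciou=False):
    """DIoU loss for two boxes in (x1, y1, x2, y2) format; CIoU if use_ciou=True."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Overlap area and IoU
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / union

    # Squared center distance rho^2, normalized by c^2, the squared
    # diagonal of the smallest box enclosing both boxes
    pcx, pcy = (px1 + px2) / 2.0, (py1 + py2) / 2.0
    tcx, tcy = (tx1 + tx2) / 2.0, (ty1 + ty2) / 2.0
    rho2 = (pcx - tcx) ** 2 + (pcy - tcy) ** 2
    c2 = ((max(px2, tx2) - min(px1, tx1)) ** 2
          + (max(py2, ty2) - min(py1, ty1)) ** 2)

    loss = 1.0 - iou + rho2 / c2  # DIoU loss
    if use_ciou:
        # Aspect-ratio consistency term v and its trade-off weight alpha
        v = (4.0 / math.pi ** 2) * (
            math.atan((tx2 - tx1) / (ty2 - ty1))
            - math.atan((px2 - px1) / (py2 - py1))) ** 2
        alpha = v / ((1.0 - iou) + v + 1e-9)
        loss += alpha * v  # CIoU loss
    return loss
```

For identical boxes the loss is zero (IoU is 1, the distance and aspect-ratio penalties vanish), and for disjoint boxes the distance term keeps the loss above 1, which is what gives DIoU a gradient signal even when IoU alone is zero.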