Publication | Open Access
A Simple Baseline for Multi-Object Tracking.
Citations: 76
References: 18
Year: 2020
Artificial Intelligence, Multiple Instance Learning, Engineering, Machine Learning, Homogeneous Branches, Image Analysis, Data Science, Pattern Recognition, Object Tracking, Robot Learning, Simple Baseline, Multiple Object Tracking, Source Code, Machine Vision, Feature Learning, Object Detection, Moving Object Tracking, Computer Science, Deep Learning, Computer Vision, Tracking System
Recent advances in object detection and re‑identification (re‑ID) have improved multi‑object tracking, yet joint learning of the two tasks in a single network remains underexplored, and prior attempts suffer from ambiguous re‑ID features caused by ROI‑Align sampling. This work introduces FairMOT, a simple baseline that jointly learns detection and re‑ID through two homogeneous branches. FairMOT replaces ROI‑Align with pixel‑wise prediction of objectness scores and re‑ID features, so that training is not biased toward detection. It achieves state‑of‑the‑art detection and tracking accuracy on multiple public datasets and reduces identity switches. The code and pretrained models are publicly available.
There has been remarkable progress in recent years on object detection and re-identification (re-ID), which are the key components of multi-object tracking. However, little attention has been paid to accomplishing the two tasks jointly in a single network. Our study shows that previous attempts ended up with degraded accuracy mainly because the re-ID task is not fairly learned, which causes many identity switches. The unfairness is two-fold: (1) they treat re-ID as a secondary task whose accuracy heavily depends on the primary detection task, so training is largely biased toward detection and neglects re-ID; (2) they use ROI-Align, borrowed directly from object detection, to extract re-ID features. This introduces considerable ambiguity in characterizing objects because many sampling points may belong to distracting instances or background. To solve these problems, we present a simple approach, FairMOT, which consists of two homogeneous branches that predict pixel-wise objectness scores and re-ID features. The resulting fairness between the tasks allows FairMOT to obtain high levels of detection and tracking accuracy and to outperform previous state-of-the-art methods by a large margin on several public datasets. The source code and pre-trained models are released at this https URL.
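To make the idea of pixel-wise prediction concrete, here is a minimal sketch of how the outputs of two such branches might be read out at inference time. It is not the released FairMOT code. The function names `find_peaks` and `read_embeddings`, the 8-neighbour peak test, the 0.5 threshold, and the L2 normalisation are all assumptions chosen for illustration. The point it shows is that each object's re-ID feature is taken from a single pixel of the embedding map at an objectness peak, rather than pooled over a box as with ROI-Align.

```python
import math
from typing import List, Tuple

Heatmap = List[List[float]]              # H x W objectness scores in [0, 1]
EmbeddingMap = List[List[List[float]]]   # H x W x D re-ID features


def find_peaks(heatmap: Heatmap, threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Return (row, col) of pixels at or above the threshold that are strictly
    greater than all of their neighbours (hypothetical peak rule)."""
    height, width = len(heatmap), len(heatmap[0])
    peaks = []
    for r in range(height):
        for c in range(width):
            score = heatmap[r][c]
            if score < threshold:
                continue
            # Collect the up-to-8 surrounding pixels, clipped at the borders.
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(height, r + 2))
                for cc in range(max(0, c - 1), min(width, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(score > n for n in neighbours):
                peaks.append((r, c))
    return peaks


def read_embeddings(
    embeddings: EmbeddingMap, peaks: List[Tuple[int, int]]
) -> List[List[float]]:
    """Take the re-ID vector stored at each peak pixel and L2-normalise it."""
    features = []
    for r, c in peaks:
        vec = embeddings[r][c]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # guard against zero vectors
        features.append([v / norm for v in vec])
    return features
```

Because each feature comes from one location, there are no box sampling points that can fall on a neighbouring instance or on background, which is the ambiguity the abstract attributes to ROI-Align.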