Publication | Closed Access
A Bottom-Up and Top-Down Integration Framework for Online Object Tracking
Citations: 13
References: 63
Year: 2020
Sparse Coding, Engineering, Machine Learning, Online Object Tracking, Intelligent Systems, Visual Surveillance, Image Analysis, Data Science, Pattern Recognition, Object Tracking, Machine Vision, Object Detection, Robust Online Object, Moving Object Tracking, Computer Science, Video Understanding, Deep Learning, Computer Vision, Eye Tracking, Particle Graph, Tracking System
Robust online object tracking entails integrating short-term memory based trackers and long-term memory based trackers in an elegant framework to handle structural and appearance variations of unknown objects in an online manner. The integration of, and synergy between, short-term and long-term memory based trackers have not yet been well studied in the literature, especially in pre-training free settings. To address this issue, this paper presents a bottom-up and top-down integration framework. The bottom-up component realizes a data-driven approach to particle generation: it exploits a short-term memory based tracker to generate bounding box proposals in a new frame. In the top-down component, this paper presents a graph regularized sparse coding scheme as the long-term memory based tracker. The over-complete bases for sparse coding are composed of part-based representations learned from earlier tracking results and new observations, forming a space with rich temporal context information. A particle graph is computed whose nodes are the bottom-up discriminative particles and whose edges are formed on the fly from appearance and spatial-temporal similarities between particles. The particle graph induces a regularization term in the optimization of the sparse coding coefficients for the bottom-up particles. In experiments, the proposed method is tested on the widely used OTB-100 and VOT2016 benchmarks, where it outperforms baselines, including deep learning based trackers. In addition, the outputs of the top-down sparse coding are potentially useful for downstream tasks such as action recognition, multiple-object tracking, and object re-identification.
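The abstract does not give the objective function, so the following is only a minimal sketch of generic graph regularized sparse coding, not the paper's formulation. It assumes an objective of the form 0.5·||X − DA||² + λ·||A||₁ + 0.5·β·tr(A L Aᵀ), where L is the Laplacian of a particle graph whose edge weights are a product of Gaussian appearance and spatial kernels, and it solves the problem with plain ISTA. All function names, the kernel choice, and the parameters `sigma_app`, `sigma_pos`, `lam`, and `beta` are illustrative assumptions; temporal similarity is omitted for brevity.

```python
import numpy as np


def particle_graph_laplacian(features, centers, sigma_app=1.0, sigma_pos=20.0):
    """Build a graph Laplacian over particles (hypothetical similarity choice).

    features: (n, d) appearance descriptors, one row per particle.
    centers:  (n, 2) bounding box centers, one row per particle.
    """
    app_d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    pos_d2 = ((centers[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    # Edge weight: product of appearance and spatial Gaussian kernels (assumption).
    W = np.exp(-app_d2 / (2 * sigma_app**2)) * np.exp(-pos_d2 / (2 * sigma_pos**2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return np.diag(W.sum(axis=1)) - W  # unnormalized Laplacian L = Deg - W


def objective(X, D, A, L, lam, beta):
    """Assumed objective: reconstruction + L1 sparsity + graph smoothness."""
    recon = 0.5 * np.sum((X - D @ A) ** 2)
    smooth = 0.5 * beta * np.trace(A @ L @ A.T)
    return recon + lam * np.abs(A).sum() + smooth


def graph_sparse_code(X, D, L, lam=0.1, beta=0.1, n_iter=200):
    """Solve the assumed objective with ISTA (proximal gradient descent).

    X: (d, n) particle features as columns, D: (d, k) dictionary,
    L: (n, n) particle graph Laplacian. Returns A of shape (k, n).
    """
    A = np.zeros((D.shape[1], X.shape[1]))
    # Step size 1/Lc, with Lc an upper bound on the gradient's Lipschitz constant.
    step = 1.0 / (np.linalg.norm(D, 2) ** 2 + beta * np.linalg.norm(L, 2))
    for _ in range(n_iter):
        grad = D.T @ (D @ A - X) + beta * (A @ L)  # gradient of the smooth terms
        Z = A - step * grad
        A = np.sign(Z) * np.maximum(np.abs(Z) - step * lam, 0.0)  # soft threshold
    return A
```

In this sketch the Laplacian term pulls the codes of particles that look alike and lie close together toward each other, which is one common way a graph can "induce a regularization term" on sparse coefficients. A tracker built on it might score each particle by its reconstruction error under its code, though the abstract does not say how the paper selects the final box.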