Publication | Closed Access
TF-Blender: Temporal Feature Blender for Video Object Detection
Citations: 170
References: 37
Year: 2021
Temporal Feature Blender, Machine Vision, Image Analysis, Data Science, Machine Learning, Pattern Recognition, Spatial Information, Video Processing, Engineering, Video Content Analysis, Video Object Detection, Computer Science, Video Understanding, Temporal Information, Deep Learning, Video Retrieval, Video Interpretation, Computer Vision
Video object detection is a challenging task because isolated video frames may suffer from appearance deterioration, which introduces great confusion for detection. One popular solution is to exploit temporal information and enhance the per-frame representation by aggregating features from neighboring frames. Despite achieving improvements in detection, existing methods focus on selecting higher-level video frames for aggregation rather than modeling lower-level temporal relations to strengthen the feature representation. To address this limitation, we propose a novel solution named TF-Blender, which includes three modules: 1) Temporal relation models the relations between the current frame and its neighboring frames to preserve spatial information; 2) Feature adjustment enriches the representation of every neighboring feature map; 3) Feature blender combines the outputs of the first two modules and produces stronger features for the later detection tasks. Owing to its simplicity, TF-Blender can be effortlessly plugged into any detection network to improve detection behavior. Extensive evaluations on the ImageNet VID and YouTube-VIS benchmarks indicate the performance guarantees of using TF-Blender on recent state-of-the-art methods. Code is available at https://github.com/goodproj13/TF-Blender.
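The three-module structure described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). Every function name, the cosine-similarity weighting, the mean-based adjustment, and the `alpha` parameter are assumptions chosen only to show how a relation step, an adjustment step, and a blending step might fit together.

```python
import numpy as np

def temporal_relation(current, neighbors, eps=1e-8):
    """Hypothetical relation module: per-location softmax weights over neighbors.

    current:   (C, H, W) feature map of the current frame
    neighbors: (N, C, H, W) feature maps of N neighboring frames
    returns:   (N, H, W) weights that sum to 1 over the N axis
    """
    # Cosine similarity along the channel axis (an assumed relation measure).
    dot = (neighbors * current[None]).sum(axis=1)
    norms = np.linalg.norm(neighbors, axis=1) * np.linalg.norm(current, axis=0)[None]
    sim = dot / (norms + eps)
    # Numerically stable softmax across neighboring frames.
    sim = sim - sim.max(axis=0, keepdims=True)
    weights = np.exp(sim)
    return weights / weights.sum(axis=0, keepdims=True)

def feature_adjustment(neighbors, alpha=0.5):
    """Hypothetical adjustment module: pull each neighbor toward the neighbor mean."""
    mean = neighbors.mean(axis=0, keepdims=True)
    return neighbors + alpha * (mean - neighbors)

def feature_blender(weights, adjusted):
    """Hypothetical blender: relation-weighted sum of the adjusted neighbor features."""
    return (weights[:, None] * adjusted).sum(axis=0)

def tf_blend_sketch(current, neighbors, alpha=0.5):
    """Chain the three assumed steps and return a (C, H, W) blended feature map."""
    weights = temporal_relation(current, neighbors)
    adjusted = feature_adjustment(neighbors, alpha=alpha)
    return feature_blender(weights, adjusted)
```

In this sketch the blended map has the same shape as the current frame's feature map, which is what would let such a step sit between a backbone and a detection head without changing either.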