Publication | Open Access
Sparse4D v2: Recurrent Temporal Fusion with Sparse Model
Year: 2023
Engineering, Machine Learning, Video Processing, Sparse Perception Algorithm, Sparse Features, Image Analysis, Data Science, Pattern Recognition, Fusion Learning, Multimodal Sensor Fusion, Temporal Fusion, Computational Imaging, Machine Vision, Computer Science, Video Understanding, Deep Learning, Feature Fusion, Computer Vision, Scene Understanding, Recurrent Temporal Fusion
Sparse algorithms offer great flexibility for multi-view temporal perception tasks. In this paper, we present an enhanced version of Sparse4D in which we improve the temporal fusion module by implementing a recurrent form of multi-frame feature sampling. By decoupling image features from structured anchor features, Sparse4D enables a highly efficient transformation of temporal features, so that temporal fusion requires only the frame-by-frame transmission of sparse features. The recurrent temporal fusion approach provides two main benefits. First, it reduces the computational complexity of temporal fusion from $O(T)$ to $O(1)$ in the number of fused frames $T$, which significantly improves inference speed and memory usage. Second, it enables the fusion of long-term information, so that temporal fusion yields more pronounced performance improvements. Our proposed approach, Sparse4Dv2, further enhances the performance of the sparse perception algorithm and achieves state-of-the-art results on the nuScenes 3D detection benchmark. Code will be available at https://github.com/linxuewu/Sparse4D.
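
To make the $O(T)$ versus $O(1)$ claim concrete, the following is a minimal toy sketch, not the Sparse4Dv2 implementation. It assumes each frame is already reduced to one scalar feature per instance, and it replaces the paper's anchor projection and attention-based fusion with a simple exponential blend. All names (`SampleCounter`, `multi_frame_fusion`, `recurrent_fusion`, `window`, `decay`) are hypothetical and chosen only for illustration. The sketch counts how many frame-sampling operations each scheme performs.

```python
from typing import List, Optional, Sequence


class SampleCounter:
    """Counts feature-sampling operations (a stand-in for the costly step)."""

    def __init__(self) -> None:
        self.calls = 0

    def sample(self, frame: Sequence[float]) -> List[float]:
        self.calls += 1
        return list(frame)


def multi_frame_fusion(frames: Sequence[Sequence[float]], t: int,
                       window: int, counter: SampleCounter) -> List[float]:
    """Re-sample every frame in the window at each step: O(T) samplings."""
    start = max(0, t - window + 1)
    sampled = [counter.sample(frames[k]) for k in range(start, t + 1)]
    n = len(sampled)
    # Toy fusion (assumption): average each instance feature over the window.
    return [sum(col) / n for col in zip(*sampled)]


def recurrent_fusion(state: Optional[List[float]], frame: Sequence[float],
                     counter: SampleCounter, decay: float = 0.5) -> List[float]:
    """Sample only the current frame and blend it with the carried state: O(1)."""
    current = counter.sample(frame)
    if state is None:
        return current
    # Toy fusion (assumption): exponential blend of old state and new features.
    return [decay * s + (1.0 - decay) * c for s, c in zip(state, current)]


# Eight toy frames, two instances per frame.
frames = [[float(t), float(2 * t)] for t in range(8)]
window = 4

multi_counter = SampleCounter()
recurrent_counter = SampleCounter()
state = None
for t in range(len(frames)):
    multi_frame_fusion(frames, t, window, multi_counter)
    state = recurrent_fusion(state, frames[t], recurrent_counter)

print("multi-frame samplings:", multi_counter.calls)
print("recurrent samplings:", recurrent_counter.calls)
```

In this toy setting the recurrent variant samples each frame exactly once regardless of `window`, while the multi-frame variant's per-step cost grows with `window`. The carried `state` also retains a decaying contribution from every earlier frame, which loosely mirrors the long-term fusion benefit described in the abstract.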