Publication | Closed Access
Fully Sparse Transformer 3-D Detector for LiDAR Point Cloud
Citations: 18 | References: 43 | Year: 2023
Engineering, Machine Learning, Object Detector, Point Cloud Processing, Point Cloud, 3D Computer Vision, Image Analysis, Pattern Recognition, Laser-based Sensor, Computational Geometry, Lidar Point Cloud, Machine Vision, Object Detection, Dynamic Queries, Lidar, Computer Science, Deep Learning, 3D Object Recognition, Computer Vision
3D object detectors usually adopt frameworks similar to those used for 2D detection and benefit from advances in 2D detection tasks. In these frameworks, the unstructured, sparse point cloud features must be converted into dense grids to be compatible with popular 2D operators such as convolution and transformers, which incurs extra computational cost. In this paper, we propose a simple and efficient Fully Sparse TRansformer (FSTR) for LiDAR-based 3D object detection, which can be combined with state-of-the-art sparse backbones to form a fully sparse, end-to-end, simple, and efficient detection framework. FSTR uses the sparse voxel features from the sparse backbone as input tokens without any custom operators. Further, we introduce dynamic queries that provide the decoder with a priori location and context of the foreground, and we drop high-confidence background tokens to further reduce redundant computation. We propose Gaussian denoising queries to speed up decoder training and make the decoder more adaptable to the distribution of sparse voxel features. Extensive experiments on the nuScenes and Argoverse2 benchmarks validate the effectiveness of the proposed method. FSTR outperforms all real-time LiDAR methods on the official nuScenes benchmark, reaching 69.5 mAP and 72.9 NDS. On the long-range detection benchmark Argoverse2, the proposed method achieves a new state-of-the-art performance of 39.9 mAP, outperforming existing LiDAR detectors and even LiDAR-camera detectors by a large margin (+9.4 mAP and +7.5 mAP), which shows the advantage of the proposed method for long-range detection.
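The abstract names three mechanisms without giving their details: dynamic queries, dropping high-confidence background tokens, and Gaussian denoising queries. The following is only a minimal sketch of how such steps might look, not the authors' implementation. It assumes that each sparse voxel token carries a foreground score, that queries are seeded from the top-scoring tokens, and that denoising queries are ground-truth centers perturbed with Gaussian noise. The function names and the parameters `num_queries`, `bg_threshold`, `sigma`, and `copies` are hypothetical.

```python
import random


def select_tokens(tokens, fg_scores, num_queries=2, bg_threshold=0.9):
    """Pick query seeds and prune background tokens (illustrative only).

    tokens:    list of (x, y, z) voxel centers from a sparse backbone
    fg_scores: foreground probability in [0, 1] for each token
    """
    # Assumption: the highest-scoring foreground tokens seed the queries.
    order = sorted(range(len(tokens)), key=lambda i: fg_scores[i], reverse=True)
    query_idx = order[:num_queries]
    # Assumption: a token is dropped when its background confidence
    # (1 - foreground score) exceeds bg_threshold.
    kept_idx = [i for i in range(len(tokens)) if 1.0 - fg_scores[i] <= bg_threshold]
    return query_idx, kept_idx


def gaussian_denoising_queries(gt_centers, sigma=0.5, copies=2, seed=0):
    """Build noisy copies of ground-truth centers for decoder training."""
    rng = random.Random(seed)
    noisy = []
    for _ in range(copies):
        for x, y, z in gt_centers:
            # Hypothetical choice: independent zero-mean noise on each axis.
            noisy.append((
                x + rng.gauss(0.0, sigma),
                y + rng.gauss(0.0, sigma),
                z + rng.gauss(0.0, sigma),
            ))
    return noisy


tokens = [(1.0, 2.0, 0.5), (30.0, -4.0, 0.2), (5.5, 7.0, 0.8), (60.0, 12.0, 0.1)]
fg_scores = [0.95, 0.05, 0.60, 0.02]
query_idx, kept_idx = select_tokens(tokens, fg_scores)
dn_queries = gaussian_denoising_queries([(1.2, 2.1, 0.5)])
```

In this toy setup the tokens that survive pruning are the only ones the decoder would attend to, which is where the saving in redundant computation would come from under these assumptions.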