PVC-SSD: Point-Voxel Dual-Channel Fusion With Cascade Point Estimation for Anchor-Free Single-Stage 3-D Object Detection

Abstract

Existing single-stage 3D object detection algorithms, whether point-based or voxel-based, struggle to achieve high detection performance across diverse object categories simultaneously. Moreover, current point-voxel algorithms often fail to fully exploit the complementary strengths of the two sparse point cloud feature extraction methods, and therefore capture the local and global features of objects inadequately. To address these challenges, we introduce PVC-SSD, a novel anchor-free single-stage 3D object detection algorithm that employs point-voxel dual-channel fusion encoding to model both local and global features, thereby improving overall detection performance. The proposed algorithm comprises three key components. First, the Point-Voxel Dual-Channel Fusion module integrates the local and global features of the object. Second, the Cascade Candidate Point Estimation module improves the quality of candidate points. Third, the Position Encoding Self-Attention module establishes pointwise correlations within sparse point clouds; it reinforces foreground features and mitigates geometric differences within the same category caused by factors such as viewpoint and distance. Extensive experiments on the KITTI and Waymo large-scale 3D object detection datasets demonstrate the competitiveness and efficiency of PVC-SSD in multi-category detection tasks.
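To make the idea of combining point-level and voxel-level features concrete, the following is a minimal illustrative sketch, not the PVC-SSD module itself. It assumes a uniform voxel grid, mean pooling inside each voxel, and fusion by simple concatenation; the function names `voxel_index` and `fuse_point_voxel` and the `voxel_size` parameter are hypothetical and do not come from the paper.

```python
from collections import defaultdict
from math import floor


def voxel_index(point, voxel_size):
    """Map an (x, y, z) point to the integer index of the voxel containing it."""
    return tuple(floor(c / voxel_size) for c in point)


def fuse_point_voxel(points, features, voxel_size):
    """Concatenate each point's own feature with the mean feature of its voxel.

    Illustrative assumption: the per-point feature stands in for the fine,
    point-channel information and the voxel mean for the coarser,
    voxel-channel context. The actual fusion in PVC-SSD may differ.
    """
    sums = {}
    counts = defaultdict(int)

    # First pass: accumulate feature sums and point counts per voxel.
    for p, f in zip(points, features):
        key = voxel_index(p, voxel_size)
        if key not in sums:
            sums[key] = [0.0] * len(f)
        sums[key] = [s + v for s, v in zip(sums[key], f)]
        counts[key] += 1

    # Second pass: append the voxel mean to each point's own feature.
    fused = []
    for p, f in zip(points, features):
        key = voxel_index(p, voxel_size)
        voxel_mean = [s / counts[key] for s in sums[key]]
        fused.append(list(f) + voxel_mean)
    return fused
```

In this sketch, points that fall in the same voxel share the same pooled half of the fused vector while keeping their individual point features, which is one simple way to pair fine and coarse information for a sparse point cloud.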
