Publication | Closed Access
Multi-level Fusion Based 3D Object Detection from Monocular Images
Citations: 396
References: 32
Year: 2018
Unknown Venue
3D Computer Vision, Machine Vision, Image Analysis, Engineering, 3D Vision, Pattern Recognition, Object Detection, RGB Image, Field Robotics, Extended Reality, Point Cloud Processing, Depth Map, Deep Learning, Localization, 3D Object Recognition, Multilevel Fusion, Computer Vision, Single RGB Image
In this paper, we present an end-to-end, multi-level fusion based framework for 3D object detection from a single monocular image. The network is composed of two parts: one generates 2D region proposals, and the other simultaneously predicts each object's 2D location, orientation, dimensions, and 3D location. With the help of a stand-alone module that estimates the disparity and computes the 3D point cloud, we introduce the multi-level fusion scheme. First, we encode the disparity information with a front view feature representation and fuse it with the RGB image to enhance the input. Second, features extracted from the original input and from the point cloud are combined to boost object detection. For 3D localization, we introduce an extra stream that predicts the location information directly from the point cloud and add its output to the aforementioned location prediction. The proposed algorithm directly outputs both 2D and 3D object detection results in an end-to-end fashion with only a single RGB image as input. Experimental results on the challenging KITTI benchmark demonstrate that our algorithm significantly outperforms state-of-the-art monocular methods.
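The abstract does not give formulas for computing the point cloud from disparity or for building the front view representation. As a rough illustration only, the sketch below uses the standard pinhole stereo relation (depth = focal length × baseline / disparity) to back-project a disparity map into a per-pixel XYZ map, then stacks it with the RGB channels. The function names, the parameters `focal_px`, `baseline_m`, `cx`, `cy`, and the choice of plain channel concatenation are all assumptions made for this example, not the authors' implementation.

```python
import numpy as np


def disparity_to_point_cloud(disparity, focal_px, baseline_m, cx, cy, eps=1e-6):
    """Back-project an (H, W) disparity map to an (H, W, 3) XYZ map.

    Assumes a pinhole camera with a single focal length in pixels and a
    principal point (cx, cy). Pixels with disparity <= eps are marked
    invalid and their XYZ values are set to zero.
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    h, w = disparity.shape
    # u indexes columns (image x), v indexes rows (image y).
    u, v = np.meshgrid(np.arange(w), np.arange(h))

    valid = disparity > eps
    depth = np.zeros_like(disparity)
    depth[valid] = focal_px * baseline_m / disparity[valid]

    x = (u - cx) * depth / focal_px
    y = (v - cy) * depth / focal_px
    return np.stack([x, y, depth], axis=-1), valid


def fuse_rgb_with_xyz(rgb, xyz):
    """Hypothetical input-level fusion: concatenate RGB and XYZ channels.

    This is only one simple way to combine the two front view maps; the
    abstract does not specify the encoding used in the paper.
    """
    return np.concatenate([rgb, xyz], axis=-1)
```

Because the XYZ map keeps the image's height and width, it can be treated as an extra set of image channels, which is what makes a front view encoding convenient for feeding a 2D convolutional network.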