Publication | Closed Access
Faster R-CNN Learning-Based Semantic Filter for Geometry Estimation and Its Application in vSLAM Systems
Citations: 16
References: 26
Year: 2021
Geometric Learning, Engineering, Vslam Systems, 3D Computer Vision, Image Analysis, Semantic Information, Stereo Vision, Image-based Modeling, Epipolar Geometry, Computational Imaging, Geometry Estimation, Computational Geometry, Geometric Modeling, Machine Vision, Geometric Feature Modeling, Epipolar Geometry Estimation, Structure From Motion, Deep Learning, 3D Object Recognition, Computer Vision, 3D Vision, Natural Sciences, Computer Stereo Vision
Epipolar geometry is a fundamental constraint used in computer vision systems to estimate parameters using correspondence. The most common way to describe epipolar geometry is by means of a $3\times 3$ matrix called the fundamental matrix, and such matrices are used to store the precise geometric information relating a pair of stereo images. Its efficient estimation substantially improves initialization in visual simultaneous localization and mapping (vSLAM), which uses correspondence-based epipolar geometry to determine the trajectory of the camera and a three-dimensional scene. Conventional robust methods for epipolar geometry estimation can become computationally inefficient and inaccurate when there are low-quality correspondences. Because semantic information can be more stable than pixel intensities/descriptors, a novel Faster Region-based Convolutional Network (R-CNN) learning-based approach called the semantic filter is proposed in this paper to address these problems. The semantic filter is first trained on different semantic patches, which are described in terms of their different outlier distributions, providing different semantic labels for image contexts. Then, the patches with low-level semantic labels are filtered out. Finally, precise and robust correspondences can be determined by matches using the high-level semantic contexts, making the correspondence-based calculation more accurate. For dynamic outdoor scenes, the results of extensive experiments show that our semantic filter can help vSLAM localize accurately and robustly on a map from different viewpoints. In a completely static scenario, our semantic filter can remove the low-quality correspondences, enabling the mobile robot to operate well.
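The filter-then-estimate idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the detector is not included, and the patch boxes and the labels `"low"` and `"high"` are hypothetical stand-ins for what a trained Faster R-CNN would output. The estimator is the standard normalized eight-point algorithm, used here only because it is a simple way to obtain the $3\times 3$ fundamental matrix $F$ satisfying $x_2^\top F x_1 = 0$; the function names `semantic_filter` and `eight_point` are likewise illustrative.

```python
import numpy as np


def semantic_filter(pts1, pts2, boxes, labels, rejected_labels=("low",)):
    """Drop correspondences whose first-image point lies in a rejected patch.

    boxes are (x_min, y_min, x_max, y_max) tuples; boxes and labels are
    assumed to come from a detector that is not part of this sketch.
    """
    keep = np.ones(len(pts1), dtype=bool)
    for (x_min, y_min, x_max, y_max), label in zip(boxes, labels):
        if label in rejected_labels:
            inside = (
                (pts1[:, 0] >= x_min) & (pts1[:, 0] <= x_max)
                & (pts1[:, 1] >= y_min) & (pts1[:, 1] <= y_max)
            )
            keep &= ~inside
    return pts1[keep], pts2[keep]


def _normalize(pts):
    """Hartley normalization: zero centroid, mean distance sqrt(2)."""
    mean = pts.mean(axis=0)
    scale = np.sqrt(2) / np.linalg.norm(pts - mean, axis=1).mean()
    T = np.array([
        [scale, 0.0, -scale * mean[0]],
        [0.0, scale, -scale * mean[1]],
        [0.0, 0.0, 1.0],
    ])
    homog = np.column_stack([pts, np.ones(len(pts))])
    return (T @ homog.T).T, T


def eight_point(pts1, pts2):
    """Estimate F (unit Frobenius norm) from at least 8 correspondences."""
    x1, T1 = _normalize(pts1)
    x2, T2 = _normalize(pts2)
    # Each row of A encodes one constraint x2^T F x1 = 0 (F flattened row-major).
    A = np.column_stack([x2[:, [0]] * x1, x2[:, [1]] * x1, x1])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce rank 2 by zeroing the smallest singular value.
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0
    F = U @ np.diag(S) @ Vt
    # Undo the normalization so F applies to pixel coordinates.
    F = T2.T @ F @ T1
    return F / np.linalg.norm(F)
```

A typical call would be `F = eight_point(*semantic_filter(pts1, pts2, boxes, labels))`, where `pts1` and `pts2` are `(N, 2)` arrays of matched pixel coordinates. In practice a robust estimator such as RANSAC would normally still wrap the final step; the sketch only shows where a semantic pre-filter would sit in the pipeline.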