Monocular Localization in HD Maps by Combining Semantic Segmentation and Distance Transform

Abstract

Simple yet robust long-term localization remains an open research topic. Existing approaches require either dense maps, expensive sensors, specialized map features or proprietary detectors. We propose using semantic segmentation on monocular camera images to localize directly in an HD map as used for automated driving. This combines lightweight yet powerful HD maps with the simplicity of monocular vision and the flexibility of neural networks. The major challenges arising from this combination are data association and robustness against misdetections. Association is solved efficiently by applying a distance transform to binary per-class images. This provides not only a fast lookup table with a smooth gradient, as needed for pose graph optimization, but also dynamic association by default. A sliding-window pose graph optimization combines single-image detections with vehicle odometry, smoothing the results and helping to overcome even misclassifications in consecutive frames. Evaluation against a highly accurate 6D visual localization shows that our approach can achieve the accuracy required for automated driving, making it one of the most lightweight and flexible methods to do so.
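
The association idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `distance_transform` and `association_cost`, the toy lane mask, and the pure column shift standing in for a pose are all assumptions made for illustration. The brute-force transform below is chosen for readability only; a practical system would presumably use a linear-time distance transform and interpolate the lookup to obtain a gradient.

```python
import math

def distance_transform(mask):
    """Brute-force Euclidean distance transform of a binary per-class image.

    mask[r][c] is True where the segmentation detected the class.
    Returns, for every pixel, the distance to the nearest detected pixel.
    """
    detected = [(r, c) for r, row in enumerate(mask)
                for c, hit in enumerate(row) if hit]
    if not detected:
        raise ValueError("mask contains no detected pixels")
    return [
        [min(math.hypot(r - dr, c - dc) for dr, dc in detected)
         for c in range(len(row))]
        for r, row in enumerate(mask)
    ]

def association_cost(dist, points, col_shift=0):
    """Sum of squared distance lookups for projected map points.

    points are (row, col) pixel positions of map elements; col_shift is a
    hypothetical stand-in for a pose error. No explicit point-to-detection
    matching is needed: each lookup implicitly picks the nearest detection.
    """
    rows, cols = len(dist), len(dist[0])
    cost = 0.0
    for r, c in points:
        rr = min(max(r, 0), rows - 1)              # clamp to image bounds
        cc = min(max(c + col_shift, 0), cols - 1)
        cost += dist[rr][cc] ** 2
    return cost

# Toy example: a vertical lane marking detected in column 3 of a 5x7 image.
lane_mask = [[c == 3 for c in range(7)] for _ in range(5)]
lane_dist = distance_transform(lane_mask)
map_points = [(r, 3) for r in range(5)]  # map lane projected at the true pose

print(association_cost(lane_dist, map_points, 0))  # aligned projection
print(association_cost(lane_dist, map_points, 2))  # projection shifted by 2 px
```

In this toy setup the cost is zero when the projected map points coincide with the detection and grows with the shift, which is the property an optimizer can exploit as a residual.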
