Semi-Supervised Deep Learning for Monocular Depth Map Prediction
2017 · 685 citations · 24 references
Keywords: Geometric Learning, Machine Vision, Machine Learning, Data Science, Image Analysis, Pattern Recognition, Stereo Vision, Engineering, Depth Map Prediction, 3D Vision, Scene Understanding, Semi-supervised Deep Learning, Depth Map, Deep Learning, Scene Modeling, Monocular Images, Computer Vision
Supervised deep learning for monocular depth map prediction is limited by scarce dense ground‑truth data: LiDAR‑derived depth is noisy, sparsely sampled, and imperfectly calibrated, making accurate depth estimation in realistic outdoor scenes difficult. This work introduces a semi‑supervised method that learns depth from monocular images. The approach combines sparse ground‑truth supervision with a direct image‑alignment loss that enforces photoconsistency of the predicted dense depth maps in a stereo setup. Experiments show that the method outperforms state‑of‑the‑art techniques in single‑image depth prediction.
Supervised deep learning often suffers from the lack of sufficient training data. Specifically in the context of monocular depth map prediction, it is barely possible to determine dense ground truth depth images in realistic dynamic outdoor environments. When using LiDAR sensors, for instance, noise is present in the distance measurements, the calibration between sensors cannot be perfect, and the measurements are typically much sparser than the camera images. In this paper, we propose a novel approach to depth map prediction from monocular images that learns in a semi-supervised way. While we use sparse ground-truth depth for supervised learning, we also enforce our deep network to produce photoconsistent dense depth maps in a stereo setup using a direct image alignment loss. In experiments we demonstrate superior performance in depth map prediction from single images compared to the state-of-the-art methods.
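The abstract describes combining a sparse supervised depth loss with a direct image-alignment (photoconsistency) loss in a rectified stereo setup. The sketch below illustrates that combination in NumPy under simplifying assumptions: L1 penalties for both terms, a known focal length and baseline, and horizontal-only warping with linear interpolation. The function names (`semi_supervised_loss`, `warp_right_to_left`) and the weighting factor `lam` are illustrative, not the paper's exact formulation.

```python
import numpy as np

def supervised_term(pred_depth, gt_depth, mask):
    # L1 loss only on pixels where sparse ground-truth depth exists.
    return np.abs(pred_depth[mask] - gt_depth[mask]).mean()

def warp_right_to_left(right_img, pred_depth, focal, baseline):
    # Rectified stereo: a left-image pixel x appears in the right image
    # at x - d, where disparity d = focal * baseline / depth.
    h, w = right_img.shape
    disparity = focal * baseline / pred_depth
    xs = np.arange(w)[None, :] - disparity          # sample locations in right image
    x0 = np.clip(np.floor(xs).astype(int), 0, w - 2)
    frac = np.clip(xs - x0, 0.0, 1.0)               # linear-interpolation weight
    rows = np.arange(h)[:, None]
    return (1 - frac) * right_img[rows, x0] + frac * right_img[rows, x0 + 1]

def image_alignment_term(left_img, right_img, pred_depth, focal, baseline):
    # Photoconsistency: the right image warped by the predicted depth
    # should reproduce the left image.
    warped = warp_right_to_left(right_img, pred_depth, focal, baseline)
    return np.abs(left_img - warped).mean()

def semi_supervised_loss(pred_depth, gt_depth, mask, left_img, right_img,
                         focal, baseline, lam=0.5):
    # Sparse supervised term plus weighted unsupervised alignment term.
    return (supervised_term(pred_depth, gt_depth, mask)
            + lam * image_alignment_term(left_img, right_img,
                                         pred_depth, focal, baseline))
```

With a correct depth prediction, the warped right image matches the left image away from the disoccluded border, so both terms vanish there; in training, minimizing the combined loss pulls the network toward depth maps that are both metrically anchored by the sparse ground truth and photoconsistent across the stereo pair.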