Publication | Closed Access
Unsupervised Learning of Depth and Ego-Motion from Video
Citations: 2.8K
References: 46
Year: 2017
Unknown Venue
Image Analysis, Machine Vision, Machine Learning, Unsupervised Learning Framework, Pattern Recognition, Engineering, 3D Vision, Scene Understanding, Video Hallucination, Depth Map, Monocular Depth, Structure From Motion, Robot Learning, Deep Learning, Pose Estimation, Multi-view Geometry, Scene Modeling, Computer Vision
We present an unsupervised learning framework for the task of monocular depth and camera motion estimation from unstructured video sequences. In common with recent work [10, 14, 16], we use an end-to-end learning approach with view synthesis as the supervisory signal. In contrast to the previous work, our method is completely unsupervised, requiring only monocular video sequences for training. Our method uses single-view depth and multiview pose networks, with a loss based on warping nearby views to the target using the computed depth and pose. The networks are thus coupled by the loss during training, but can be applied independently at test time. Empirical evaluation on the KITTI dataset demonstrates the effectiveness of our approach: 1) monocular depth performs comparably with supervised methods that use either ground-truth pose or depth for training, and 2) pose estimation performs favorably compared to established SLAM systems under comparable input settings.
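The warping-based loss described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a pinhole camera with a known 3x3 intrinsics matrix `K`, a 4x4 relative pose `T_target_to_source`, a per-pixel depth map for the target view, and nearest-neighbour sampling in place of the differentiable sampler that training would require. The function names `project_target_to_source` and `photometric_loss` are hypothetical.

```python
import numpy as np


def project_target_to_source(depth, K, T_target_to_source):
    """Map every target pixel to (u, v) coordinates in the source view.

    depth: (H, W) positive depths for the target view.
    K: (3, 3) pinhole intrinsics (assumed known).
    T_target_to_source: (4, 4) rigid transform between the camera frames.
    """
    h, w = depth.shape
    # Homogeneous pixel grid of the target view, shape (3, H*W).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pixels = np.stack([u.ravel(), v.ravel(), np.ones(h * w)], axis=0)
    # Back-project pixels to 3D points in the target camera frame.
    points = (np.linalg.inv(K) @ pixels) * depth.ravel()
    # Move the points into the source camera frame.
    points_h = np.vstack([points, np.ones((1, h * w))])
    points_src = (T_target_to_source @ points_h)[:3]
    # Project onto the source image plane.
    proj = K @ points_src
    uv = proj[:2] / proj[2:3]
    return uv.T.reshape(h, w, 2)


def photometric_loss(target, source, depth, K, T_target_to_source):
    """Mean absolute error between the target and the warped source view."""
    h, w = depth.shape
    uv = project_target_to_source(depth, K, T_target_to_source)
    # Nearest-neighbour lookup: a simplification, not differentiable.
    x = np.rint(uv[..., 0]).astype(int)
    y = np.rint(uv[..., 1]).astype(int)
    # Ignore target pixels that project outside the source image.
    valid = (x >= 0) & (x < w) & (y >= 0) & (y < h)
    if not valid.any():
        return 0.0
    warped = source[y[valid], x[valid]]
    return float(np.abs(target[valid] - warped).mean())
```

In this sketch the depth map and the pose enter the same loss, which is how the two networks become coupled during training; either quantity can still be predicted on its own at test time.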