Publication | Closed Access
Indoor Navigation for Mobile Agents: A Multimodal Vision Fusion Model
Citations: 22
References: 26
Year: 2020
Unknown Venue
Engineering, Machine Learning, Field Robotics, Multi-sensor Information Fusion, Localization, Image Analysis, Pattern Recognition, Multimodal Sensor Fusion, Robot Learning, Mobile Agents, Robotics Perception, Machine Vision, Object Detection, Vision Robotics, Deep Learning, Autonomous Navigation, Computer Vision, Eye Tracking, Scene Understanding, Path Length, Robotics, Scene Modeling, Leverage Visual Information
Indoor navigation is a challenging task for mobile agents. Recent vision-based indoor navigation methods have made remarkable progress, but they do not fully leverage visual information for policy learning and struggle to perform well in unseen scenes. To address these limitations, we present a multimodal vision fusion model (MVFM) that combines several image recognition networks into a joint modality for navigation policy learning. The proposed model incorporates object detection for target searching, depth estimation for distance prediction, and semantic segmentation to depict the walkable region. By design, the model provides holistic visual knowledge for navigation. Evaluation on AI2-THOR indicates that MVFM improves on a strong baseline model by 3.49% in Success weighted by Path Length (SPL) and by 4% in success rate. Compared with other state-of-the-art systems, MVFM leads in both SPL and success rate. Extensive experiments show the effectiveness of the proposed model.
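
The abstract reports results in SPL without defining it. As a minimal sketch, assuming the commonly used definition of the metric (the mean over episodes of the success indicator times the ratio of shortest path length to the larger of the actual and shortest path lengths), SPL could be computed as follows. The function name `spl` and its argument names are hypothetical and do not come from the paper.

```python
from typing import Sequence


def spl(
    successes: Sequence[bool],
    shortest_lengths: Sequence[float],
    path_lengths: Sequence[float],
) -> float:
    """Success weighted by Path Length, averaged over episodes.

    Assumed definition: (1/N) * sum_i S_i * l_i / max(p_i, l_i), where
    S_i is the success indicator, l_i the shortest path length, and
    p_i the length of the path the agent actually took.
    """
    n = len(successes)
    if n == 0:
        raise ValueError("at least one episode is required")
    if not (n == len(shortest_lengths) == len(path_lengths)):
        raise ValueError("all inputs must have the same length")

    total = 0.0
    for success, shortest, taken in zip(successes, shortest_lengths, path_lengths):
        denom = max(taken, shortest)
        # Failed episodes contribute zero; skip degenerate zero-length cases.
        if success and denom > 0:
            total += shortest / denom
    return total / n


if __name__ == "__main__":
    # Three episodes: optimal success, success with a detour, and a failure.
    print(spl([True, True, False], [2.0, 2.0, 2.0], [2.0, 4.0, 3.0]))
```

Under this definition a successful episode scores 1 only when the agent follows a shortest path, so SPL is never higher than the plain success rate on the same episodes.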