Concepedia

Abstract

We study the problem of learning a generalizable action policy that lets an intelligent agent actively approach an object of interest in an indoor environment using only its visual inputs. While scene-driven or recognition-driven visual navigation has been widely studied, prior efforts suffer severely from limited generalization capability. In this letter, we first argue that the object searching task is environment-dependent, while the approaching ability is general. To learn a generalizable approaching policy, we present a novel solution dubbed Generalizable Approaching Policy LEarning, which takes two channels of visual features, depth and semantic segmentation, as inputs to the policy learning module. Empirical studies conducted on the House3D dataset and on a physical platform in a real-world scenario validate our hypothesis, and we further provide in-depth qualitative analysis.
