The 2018 International Conference on Unmanned Aircraft Systems

Abstract

The autonomous landing of an unmanned aerial vehicle (UAV) is still an open problem. Previous work focused on hand-crafted geometric features and sensor-data fusion to identify a fiducial marker and guide the UAV toward it. In this article we propose a method based on deep reinforcement learning that requires only low-resolution images from a downward-looking camera to drive the vehicle. The proposed approach is based on a hierarchy of Deep Q-Networks (DQNs) that serve as high-level control policies for navigation in the different phases. We implemented various technical solutions, such as a combination of vanilla and double DQNs trained with a form of prioritized buffer replay that separates experiences into multiple containers. The optimal control policy is learned without any human supervision: the agent receives only a sparse reward signal indicating the success or failure of the landing. The results show that the quadrotor can land autonomously in a large variety of simulated environments and in the presence of significant noise, showing that the underlying DQNs generalise effectively to unseen scenarios. Furthermore, under some conditions the networks outperformed human pilots.
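
The idea of a replay buffer that separates experiences into multiple containers can be illustrated with a minimal sketch. The abstract does not state how experiences are assigned to containers or in what proportion they are sampled, so everything below is an assumption made for illustration: the class name `PartitionedReplayBuffer`, the split by reward sign (positive, negative, neutral), and the sampling fractions are all hypothetical. With a sparse reward, a split of this kind would keep the rare success and failure transitions from being drowned out by the many neutral ones.

```python
import random
from collections import deque


class PartitionedReplayBuffer:
    """Illustrative buffer that stores transitions in separate containers.

    Assumption: transitions are partitioned by the sign of the reward.
    The source abstract does not specify the actual criterion.
    """

    def __init__(self, capacity_per_partition=10000, fractions=None, seed=None):
        # Hypothetical share of each mini-batch drawn from each container.
        self.fractions = fractions or {"positive": 0.25, "negative": 0.25, "neutral": 0.5}
        self.partitions = {
            name: deque(maxlen=capacity_per_partition) for name in self.fractions
        }
        self.rng = random.Random(seed)

    @staticmethod
    def partition_of(reward):
        """Map a scalar reward to the name of its container."""
        if reward > 0:
            return "positive"
        if reward < 0:
            return "negative"
        return "neutral"

    def add(self, state, action, reward, next_state, done):
        """Store one transition in the container matching its reward sign."""
        transition = (state, action, reward, next_state, done)
        self.partitions[self.partition_of(reward)].append(transition)

    def sample(self, batch_size):
        """Draw a mini-batch, taking a fixed share from each non-empty container.

        Sampling is with replacement, so a small container can still
        contribute its full share.
        """
        non_empty = [t for part in self.partitions.values() for t in part]
        if not non_empty:
            raise ValueError("cannot sample from an empty buffer")

        batch = []
        for name, fraction in self.fractions.items():
            container = self.partitions[name]
            if container:
                batch.extend(self.rng.choices(container, k=round(batch_size * fraction)))

        # Top up from all stored transitions if some containers were empty,
        # then trim in case rounding produced a slightly larger batch.
        shortfall = batch_size - len(batch)
        if shortfall > 0:
            batch.extend(self.rng.choices(non_empty, k=shortfall))
        return batch[:batch_size]


if __name__ == "__main__":
    buffer = PartitionedReplayBuffer(seed=0)
    for step in range(100):
        buffer.add(state=step, action=0, reward=0.0, next_state=step + 1, done=False)
    buffer.add(state=100, action=1, reward=1.0, next_state=101, done=True)   # success
    buffer.add(state=200, action=1, reward=-1.0, next_state=201, done=True)  # failure
    batch = buffer.sample(32)
    print(len(batch), sum(1 for t in batch if t[2] != 0))
```

In this sketch a batch of 32 always contains a fixed share of terminal transitions, even though they make up a tiny part of the stored data. Whether the original method uses fixed shares, or a different partitioning rule altogether, cannot be determined from the abstract alone.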