An Improved Method towards Multi-UAV Autonomous Navigation Using Deep Reinforcement Learning

Abstract

Autonomous navigation is a key technology for multi-UAV systems, and deep reinforcement learning can give UAVs powerful autonomous decision-making capabilities. To improve the convergence speed and stability of reinforcement learning, this paper proposes a multi-agent deep deterministic policy gradient algorithm based on prioritized experience replay, namely PER-MADDPG. The algorithm gives samples with higher priority a higher probability of being chosen for the parameter update, which speeds up convergence. Moreover, the actions of the UAVs are generated using parameter noise, which improves the stability and robustness of the algorithm. Experiments show that PER-MADDPG converges quickly, reaches good converged results, and provides excellent autonomous navigation capabilities.
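The sampling idea behind prioritized experience replay can be illustrated with a minimal sketch. The abstract does not specify the priority scheme, so the code below assumes the common proportional variant, in which a transition's priority is the magnitude of its TD error. The class name `PrioritizedReplayBuffer` and the parameters `alpha`, `beta`, and `eps` are hypothetical choices made for illustration, not the paper's implementation.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay (illustrative sketch, O(n) sampling)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha      # assumed: how strongly priorities skew sampling
        self.eps = eps          # assumed: keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0            # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions receive the current maximum priority so that
        # they are likely to be replayed at least once.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4, rng=random):
        # Sampling probability is proportional to priority ** alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idxs = rng.choices(range(len(self.data)), weights=probs, k=batch_size)
        # Importance-sampling weights offset the bias of non-uniform
        # sampling; they are normalized by the largest weight in the batch.
        n = len(self.data)
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idxs, [self.data[i] for i in idxs], weights

    def update_priorities(self, idxs, td_errors):
        # After a learning step, priorities are refreshed from the new TD errors.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a MADDPG-style setup, each stored transition would typically hold the joint observations, actions, rewards, and next observations of all UAVs, and `update_priorities` would be called with the critic's TD errors after each update. That wiring is an assumption here and is not stated in the abstract.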
