Publication | Closed Access
UAV Air Combat Autonomous Maneuver Decision Based on DDPG Algorithm
Citations: 79
References: 14
Year: 2019
Unknown Venue
Artificial Intelligence, Trajectory Planning, Aerial Robotics, Machine Learning, Aerospace Engineering, DDPG Algorithm, Engineering, Unmanned System, Air Vehicle System, Systems Engineering, Flying Robot, Computer Science, DDPG Replay Buffer, Robot Learning, Learning Control, Unmanned Vehicle, Traditional Reinforcement Learning, Trajectory Optimization
Based on reinforcement learning theory, this paper establishes a learning model for autonomous air combat maneuver decision-making by a UAV. Because traditional reinforcement learning and the DQN algorithm cannot handle continuous action spaces, the policy-gradient-based DDPG algorithm is adopted; it enables the model to output continuous, smooth control values and thus improves the accuracy of autonomous UAV control. The DDPG algorithm explores the action space by adding noise to the action values. However, the randomly generated initial combinations of action values contain a large number of invalid or low-quality individuals, which leads to inefficient learning and localization. To address this problem, the paper proposes using an optimization algorithm to generate air combat maneuver action values and adding these optimized actions to the DDPG replay buffer as initial samples. This filters out a large number of invalid action values and guarantees the correctness of the action values, while preserving exploration diversity and improving the learning efficiency of the DDPG algorithm.
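The replay-buffer seeding idea described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the one-dimensional `toy_step` environment, the random-search `optimize_action` (a stand-in for the unspecified optimization algorithm), and the clipped Gaussian noise in `noisy_action` (the abstract says only that noise is added to the action values). The actor and critic networks of DDPG are omitted entirely.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity FIFO store of (state, action, reward, next_state) tuples."""

    def __init__(self, capacity):
        self.data = deque(maxlen=capacity)

    def add(self, transition):
        self.data.append(transition)

    def sample(self, batch_size, rng):
        # Sample without replacement, capped at the current buffer size.
        return rng.sample(list(self.data), min(batch_size, len(self.data)))

    def __len__(self):
        return len(self.data)


def toy_step(state, action):
    """Hypothetical 1-D environment: reward is best when the action cancels the state."""
    next_state = state + action
    reward = -abs(next_state)
    return next_state, reward


def optimize_action(state, rng, low=-1.0, high=1.0, candidates=32):
    """Stand-in optimizer (assumption): random search over one-step reward."""
    trial_actions = [rng.uniform(low, high) for _ in range(candidates)]
    return max(trial_actions, key=lambda a: toy_step(state, a)[1])


def seed_buffer(buffer, n_samples, rng):
    """Pre-fill the buffer with optimizer-generated transitions before training."""
    for _ in range(n_samples):
        state = rng.uniform(-1.0, 1.0)
        action = optimize_action(state, rng)
        next_state, reward = toy_step(state, action)
        buffer.add((state, action, reward, next_state))


def noisy_action(policy_action, rng, sigma=0.1, low=-1.0, high=1.0):
    """Exploration: perturb the policy output with Gaussian noise, then clip to bounds."""
    return min(high, max(low, policy_action + rng.gauss(0.0, sigma)))
```

In this sketch, `seed_buffer` would be called once before training so that early minibatches drawn with `sample` contain optimizer-selected actions, while `noisy_action` would still be applied to the policy output during rollouts to keep exploration diverse.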