Publication | Closed Access

Air-Combat Strategy Using Approximate Dynamic Programming

Citations: 231
References: 24
Year: 2010

TLDR

Unmanned aircraft systems can perform many dangerous missions, but the complexity of air combat has so far prevented autonomous execution of such engagements. This study formulates a level-flight, fixed-velocity, one-on-one air-combat maneuvering problem in which the learning aircraft is given a slight performance advantage, and uses approximate dynamic programming to compute an efficient approximation of the optimal policy. The method relies on extensive feature development, reward shaping, and trajectory sampling, with a fast rollout-based policy extraction enabling online implementation. Simulation and flight tests demonstrate fast responses to rapidly changing tactical situations, long planning horizons, robust performance against an opponent from both offensive and defensive starting positions, and effective operation in MIT's real-time indoor autonomous vehicle test environment.

Abstract

Unmanned aircraft systems have the potential to perform many of the dangerous missions currently flown by manned aircraft, yet the complexity of some tasks, such as air combat, has precluded unmanned aircraft systems from successfully carrying out these missions autonomously. This paper presents a formulation of a level-flight fixed-velocity one-on-one air-combat maneuvering problem and an approximate dynamic programming approach for computing an efficient approximation of the optimal policy. In the version of the problem formulation considered, the aircraft learning the optimal policy is given a slight performance advantage. This approximate dynamic programming approach provides a fast response to a rapidly changing tactical situation, long planning horizons, and good performance, without explicit coding of air-combat tactics. The method's success is due to extensive feature development, reward shaping, and trajectory sampling. An accompanying fast and effective rollout-based policy extraction method is used to accomplish online implementation. Simulation results are provided that demonstrate the robustness of the method against an opponent, beginning from both offensive and defensive situations. Flight results are also presented using unmanned aircraft systems flown at the Massachusetts Institute of Technology's real-time indoor autonomous vehicle test environment.
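The rollout-based policy extraction the abstract describes can be sketched as a one-step lookahead against an approximate value function: for each candidate maneuver, simulate one step, score it as immediate reward plus the discounted estimated cost-to-go of the successor state, and pick the best. The sketch below is a minimal, hedged illustration of that pattern only; the feature map, dynamics, reward shaping, and maneuver set are toy assumptions, not the authors' actual implementation.

```python
# Hedged sketch of rollout-based policy extraction with an approximate
# value function. All names (features, step, reward, ACTIONS) are
# illustrative assumptions standing in for the paper's components.

GAMMA = 0.95  # discount factor (assumed)
ACTIONS = ["hard-left", "straight", "hard-right"]  # simplified maneuver set


def features(state):
    """Map a state to a feature vector. The paper stresses extensive
    feature development; this toy uses |lateral offset| and time."""
    return [abs(state[0]), state[1]]


def j_hat(state, weights):
    """Linear approximation of the optimal value function."""
    return sum(w * f for w, f in zip(weights, features(state)))


def step(state, action):
    """Toy deterministic dynamics standing in for the level-flight,
    fixed-velocity maneuvering model."""
    delta = {"hard-left": -1.0, "straight": 0.0, "hard-right": 1.0}[action]
    return [state[0] + delta, state[1] + 0.1]


def reward(state, action):
    """Shaped reward (illustrative): penalize lateral offset."""
    return -abs(state[0])


def rollout_policy(state, weights):
    """One-step lookahead (rollout): choose the action maximizing the
    immediate reward plus the discounted approximate value of the
    simulated successor state."""
    return max(
        ACTIONS,
        key=lambda a: reward(state, a) + GAMMA * j_hat(step(state, a), weights),
    )


# Example: a negative weight on |lateral offset| makes the extracted
# policy steer toward zero offset from either side.
w = [-1.0, 0.0]
print(rollout_policy([2.0, 0.0], w))   # steers left, toward zero offset
print(rollout_policy([-2.0, 0.0], w))  # steers right, toward zero offset
```

The key property this illustrates is that the expensive planning is pushed offline into fitting the value approximation, while the online policy reduces to a cheap one-step simulation per candidate action, which is what makes fast real-time response feasible.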
