Data-Based Optimal Consensus Control for Multiagent Systems With Time Delays: Using Prioritized Experience Replay

Abstract

This article addresses the optimal consensus problem for multiagent systems (MASs) with time delays. By designing a new augmented state, the delayed MAS is reformulated as a delay-free system. Each agent seeks to minimize its local cost, which may depend on the decisions of the other agents, so the problem is treated as a Nash equilibrium problem. To this end, we propose a multiagent deterministic policy gradient (MADPG) method based on actor–critic (AC) networks that minimizes the local cost ($Q$-function) using the policy gradient technique, and we prove its convergence and optimality. In particular, we develop an optimized prioritized experience replay (PER) strategy that selects high-value samples with higher probability, which enhances the networks' data utilization. Finally, the effectiveness of the algorithm and the advantages of PER are demonstrated with a simulated example and a comparative simulation.
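The idea of selecting high-value samples with higher probability can be illustrated with a minimal sketch of standard proportional prioritized replay. This is not the optimized PER strategy of the article, whose details the abstract does not give. The class name `PrioritizedReplayBuffer`, the hyperparameters `alpha`, `beta`, and `eps`, and the use of the absolute temporal-difference error as the sample "value" are all assumptions made for illustration.

```python
import random


class PrioritizedReplayBuffer:
    """Hypothetical proportional PER: P(i) = p_i^alpha / sum_k p_k^alpha."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha   # 0 gives uniform sampling, larger values skew harder
        self.eps = eps       # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0         # slot to overwrite once the buffer is full

    def add(self, transition, td_error=None):
        # Assumption: a sample with no TD error yet gets the current max priority.
        if td_error is None:
            p = max(self.priorities, default=1.0)
        else:
            p = abs(td_error) + self.eps
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4, rng=random):
        # Sampling probabilities proportional to priority^alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idx = rng.choices(range(len(self.data)), weights=probs, k=batch_size)
        # Importance-sampling weights offset the bias of non-uniform sampling.
        n = len(self.data)
        weights = [(n * probs[i]) ** (-beta) for i in idx]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idx, [self.data[i] for i in idx], weights

    def update(self, indices, td_errors):
        # Refresh priorities after the critic has recomputed the TD errors.
        for i, e in zip(indices, td_errors):
            self.priorities[i] = abs(e) + self.eps
```

In an actor–critic training loop, each agent would call `sample` to draw a minibatch, scale its critic loss by the returned weights, and then pass the new TD errors to `update`.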
