A Data-Driven Multi-Agent Autonomous Voltage Control Framework Using Deep Reinforcement Learning

TLDR

Modern power grids are growing more complex due to renewable integration and fast demand response, which challenges conventional control systems. This work proposes a data‑driven multi‑agent voltage control framework based on deep reinforcement learning. The authors formulate autonomous voltage control (AVC) as a Markov game, partition agents heuristically, and develop a multi‑agent AVC algorithm based on MADDPG (multi‑agent deep deterministic policy gradient) with centralized training and decentralized execution. The algorithm learns from scratch, gradually mastering system operation rules from input and output data. Case studies on an Illinois 200‑bus system demonstrate its effectiveness under load/generation variations, N‑1 contingencies, and a weak centralized communication environment.

Abstract

The complexity of modern power grids keeps increasing due to the expansion of renewable energy resources and the requirement of fast demand responses, which poses a great challenge for conventional power grid control systems. Existing autonomous control approaches for the power grid require an accurate system model and a powerful computational platform, which makes them difficult to scale up to large-scale energy systems with more control options and operating conditions. Facing these challenges, this article proposes a data-driven multi-agent power grid control scheme using a deep reinforcement learning (DRL) method. Specifically, the classic autonomous voltage control (AVC) problem is taken as an example and formulated as a Markov game, with a heuristic method to partition agents. Then, a multi-agent AVC (MA-AVC) algorithm based on a multi-agent deep deterministic policy gradient (MADDPG) method that features centralized training and decentralized execution is developed to solve the AVC problem. The proposed method can learn from scratch and gradually master the system operation rules from input and output data. To demonstrate the effectiveness of the proposed MA-AVC algorithm, comprehensive case studies are conducted on an Illinois 200-bus system considering load/generation changes, N-1 contingencies, and a weak centralized communication environment.
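
The centralized-training, decentralized-execution structure mentioned above can be illustrated with a minimal sketch. This is not the authors' MA-AVC implementation: it assumes linear functions in place of neural networks, omits all gradient updates and the agent partitioning, and every name (LinearActor, CentralCritic, decentralized_step) and every numeric value is hypothetical. It only shows the information flow typical of MADDPG: each actor acts on its own local observation, while each agent's critic, used during training only, sees the observations and actions of all agents.

```python
import random


class LinearActor:
    """Hypothetical per-agent policy: local observation -> bounded action."""

    def __init__(self, obs_dim, seed):
        rng = random.Random(seed)
        self.weights = [rng.uniform(-0.1, 0.1) for _ in range(obs_dim)]

    def act(self, local_obs):
        raw = sum(w * o for w, o in zip(self.weights, local_obs))
        return max(-1.0, min(1.0, raw))  # clip the action to [-1, 1]


class CentralCritic:
    """Hypothetical training-time critic: scores joint observations and actions."""

    def __init__(self, joint_dim, seed):
        rng = random.Random(seed)
        self.weights = [rng.uniform(-0.1, 0.1) for _ in range(joint_dim)]

    def q_value(self, all_obs, all_actions):
        # Concatenate every agent's observation, then every agent's action.
        joint = [x for obs in all_obs for x in obs] + list(all_actions)
        return sum(w * x for w, x in zip(self.weights, joint))


def decentralized_step(actors, observations):
    """Execution phase: each agent uses only its own local observation."""
    return [actor.act(obs) for actor, obs in zip(actors, observations)]


# Made-up example: 3 agents, each observing 4 bus voltage magnitudes (per unit).
n_agents, obs_dim = 3, 4
observations = [
    [1.02, 0.98, 1.01, 0.97],
    [0.95, 1.00, 1.03, 0.99],
    [1.04, 1.01, 0.96, 1.00],
]
actors = [LinearActor(obs_dim, seed=i) for i in range(n_agents)]
actions = decentralized_step(actors, observations)

# Training phase: one critic per agent, each fed the joint observation-action vector.
joint_dim = n_agents * obs_dim + n_agents
critics = [CentralCritic(joint_dim, seed=100 + i) for i in range(n_agents)]
q_values = [critic.q_value(observations, actions) for critic in critics]
```

Because the critics are needed only while training, the actors can be deployed without access to other agents' observations, which is consistent with the abstract's interest in operation under a weak centralized communication environment.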
