Publication | Open Access

Multi-agent deep reinforcement learning: a survey

Citations: 734

References: 172

Year: 2021

TLDR

Reinforcement learning has achieved remarkable success across many domains, yet multi‑agent reinforcement learning has lagged behind its single‑agent counterpart, only recently gaining traction to tackle real‑world complexity. This article surveys the current developments in multi‑agent deep reinforcement learning. The survey reviews recent literature that merges deep RL with multi‑agent settings, organizing the discussion into three parts: training schemes for multiple agents, emergent cooperative, competitive, and mixed behavior patterns, and the unique challenges and mitigation methods in the multi‑agent domain. The authors conclude by summarizing recent advances, highlighting emerging trends, and proposing directions for future research.

Abstract

The advances in reinforcement learning have recorded sublime success in various domains. Although the multi-agent domain has been overshadowed by its single-agent counterpart during this progress, multi-agent reinforcement learning gains rapid traction, and the latest accomplishments address problems with real-world complexity. This article provides an overview of the current developments in the field of multi-agent deep reinforcement learning. We focus primarily on literature from recent years that combines deep reinforcement learning methods with a multi-agent scenario. To survey the works that constitute the contemporary landscape, the main contents are divided into three parts. First, we analyze the structure of training schemes that are applied to train multiple agents. Second, we consider the emergent patterns of agent behavior in cooperative, competitive and mixed scenarios. Third, we systematically enumerate challenges that exclusively arise in the multi-agent domain and review methods that are leveraged to cope with these challenges. To conclude this survey, we discuss advances, identify trends, and outline possible directions for future work in this research area.
