Execution-time communication decisions for coordination of multi-agent teams

Abstract

Although multi-agent teams provide additional functionality and robustness over single-agent systems, they also present additional challenges, mainly due to the difficulty of coordinating multiple agents in the presence of uncertainty and partial observability. Agents must reason about the collective state and behaviors of the team as well as uncertainty in their own environment. In this thesis, we employ Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs), an extension of single-agent POMDPs that can be used to model and coordinate teams of agents. Although the problem of finding optimal policies for Dec-POMDPs is highly intractable, it is known that the presence of free communication transforms a multi-agent Dec-POMDP into a more tractable single-agent POMDP. We use this transformation to generate policies for multi-agent teams modeled by Dec-POMDPs. We facilitate the decentralized execution of these centralized policies by providing algorithms that allow agents to reason about communication at execution-time. Our approach trades off the need to do some computation at execution-time for the ability to generate policies more tractably at plan-time. This thesis explores the question of how communication can be used effectively to enable the coordination of cooperative multi-agent teams making sequential decisions under uncertainty and partial observability. We identify two fundamental questions that must be answered when reasoning about communication: when should agents communicate, and what should they communicate? We present two basic approaches to enabling a team of distributed agents to avoid coordination errors. The first is an algorithm that reasons over the possible joint beliefs of the team. We provide algorithms that address the questions of when and what agents should communicate. The second approach presented in this thesis avoids coordination errors by creating an individual factored policy for each agent. Factored policies provide a means for determining which state features agents should communicate, answering the questions of when and what agents should communicate. We use factored policies to identify instances of context-specific independence, in which agents can act without needing to consider the actions or observations of their teammates.
