Multi-Agent Inverse Reinforcement Learning

TLDR

Inverse reinforcement learning infers an agent's reward function from its observed behavior and is used in learning from demonstration or apprenticeship learning. This work introduces multi-agent inverse reinforcement learning, in which the reward functions of multiple agents are learned from their uncoordinated behavior. A centralized controller then coordinates the agents by optimizing a weighted sum of their reward functions. The approach is evaluated on a traffic-routing domain, where the controller coordinates multiple traffic signals to regulate traffic density. The learner not only matches but significantly outperforms the expert.

Abstract

Learning the reward function of an agent by observing its behavior is termed inverse reinforcement learning and has applications in learning from demonstration or apprenticeship learning. We introduce the problem of multi-agent inverse reinforcement learning, where the reward functions of multiple agents are learned by observing their uncoordinated behavior. A centralized controller then learns to coordinate their behavior by optimizing a weighted sum of the reward functions of all the agents. We evaluate our approach on a traffic-routing domain, in which a controller coordinates the actions of multiple traffic signals to regulate traffic density. We show that the learner can not only match but even significantly outperform the expert.
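The idea of a controller optimizing a weighted sum of per-agent reward functions can be illustrated with a minimal sketch. It is not the paper's method: the reward functions are hand-written here rather than learned, the controller is a one-step exhaustive search rather than a reinforcement learner, and all names (`combined_reward`, `best_joint_action`, `signal_reward`), the equal weights, and the toy queue model are assumptions made for illustration.

```python
from itertools import product


def combined_reward(rewards, weights, state, joint_action):
    """Weighted sum of per-agent rewards for one joint action."""
    return sum(w * r(state, a) for w, r, a in zip(weights, rewards, joint_action))


def best_joint_action(rewards, weights, state, actions):
    """Exhaustively pick the joint action with the highest combined reward."""
    candidates = product(actions, repeat=len(rewards))
    return max(candidates, key=lambda ja: combined_reward(rewards, weights, state, ja))


def signal_reward(index):
    """Hypothetical reward for one signal: negative queue left after acting."""
    def reward(state, action):
        cleared = 3 if action == "green" else 0  # assumed: green clears 3 cars
        return -max(state[index] - cleared, 0)
    return reward


# Toy setup (assumed): two signals with queue lengths 5 and 1, equal weights.
rewards = [signal_reward(0), signal_reward(1)]
weights = [0.5, 0.5]
state = (5, 1)
choice = best_joint_action(rewards, weights, state, ["green", "red"])
print(choice, combined_reward(rewards, weights, state, choice))
```

In this toy model each signal's reward depends only on its own action, so the joint optimum coincides with each agent's individual optimum. The weighted sum becomes a real trade-off once agents' rewards conflict, for example when two signals share an intersection and cannot both be green.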
