Concepedia

TLDR

A COIN is a large multi‑agent system with minimal centralized control and a global utility function that evaluates system histories. The paper surveys how to design COINs, focusing on systems where each agent uses reinforcement learning, and seeks to solve the design problem implicitly through the agents’ adaptive RL behavior. The authors investigate which individual agent reward functions, when optimized by RL, lead to high global utility and avoid collective failures such as the tragedy of the commons, Braess’s paradox, or liquidity traps. Early research on COIN design has achieved successes in artificial domains such as packet routing, leader‑follower coordination, and variants of the El Farol bar problem. As it matures, this research is expected to broaden the range of tasks that engineers can address and to offer insights into economics, game theory, and population biology.

Abstract

This paper surveys the emerging science of how to design a "COllective INtelligence" (COIN). A COIN is a large multi-agent system where: (i) There is little to no centralized communication or control; and (ii) There is a provided world utility function that rates the possible histories of the full system. In particular, we are interested in COINs in which each agent runs a reinforcement learning (RL) algorithm. Rather than use a conventional modeling approach (e.g., model the system dynamics, and hand-tune agents to cooperate), we aim to solve the COIN design problem implicitly, via the "adaptive" character of the RL algorithms of each of the agents. This approach introduces an entirely new, profound design problem: Assuming the RL algorithms are able to achieve high rewards, what reward functions for the individual agents will, when pursued by those agents, result in high world utility? In other words, what reward functions will best ensure that we do not have phenomena like the tragedy of the commons, Braess's paradox, or the liquidity trap? Although still very young, research specifically concentrating on the COIN design problem has already resulted in successes in artificial domains, in particular in packet-routing, the leader-follower problem, and in variants of Arthur's El Farol bar problem. It is expected that as it matures and draws upon other disciplines related to COINs, this research will greatly expand the range of tasks addressable by human engineers. Moreover, in addition to drawing on them, such a fully developed science of COIN design may provide much insight into other already established scientific fields, such as economics, game theory, and population biology.
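
The design question in the abstract (which per-agent reward functions lead to high world utility) can be made concrete with a toy tragedy-of-the-commons setup. The sketch below is an illustration only and is not taken from the paper: the shared-resource model, the constants `CAPACITY` and `N_AGENTS`, and the two reward functions are all hypothetical, and a simple sequential best-response loop stands in for the agents' RL algorithms. It compares agents rewarded for their own share of the resource with agents that each receive the world utility itself as their reward.

```python
from typing import Callable, List

CAPACITY = 20  # hypothetical size of the shared resource
N_AGENTS = 4   # hypothetical number of agents

def world_utility(efforts: List[int]) -> int:
    """Total yield: it collapses as combined effort approaches CAPACITY."""
    total = sum(efforts)
    return total * max(0, CAPACITY - total)

def selfish_reward(i: int, efforts: List[int]) -> int:
    """Agent i is rewarded only for its own share of the yield."""
    return efforts[i] * max(0, CAPACITY - sum(efforts))

def team_reward(i: int, efforts: List[int]) -> int:
    """Every agent is rewarded with the world utility itself."""
    return world_utility(efforts)

def best_response_dynamics(reward: Callable[[int, List[int]], int]) -> List[int]:
    """Agents take turns switching to a strictly better effort level.

    This is a stand-in for learning, not an RL algorithm: each agent
    greedily maximizes its own reward given the others' current choices.
    """
    efforts = [0] * N_AGENTS
    changed = True
    while changed:
        changed = False
        for i in range(N_AGENTS):
            def value(e: int, i: int = i) -> int:
                trial = efforts[:i] + [e] + efforts[i + 1:]
                return reward(i, trial)
            best = max(range(CAPACITY + 1), key=value)
            # Switch only on strict improvement so the loop terminates.
            if value(best) > value(efforts[i]):
                efforts[i] = best
                changed = True
    return efforts

if __name__ == "__main__":
    for name, reward in [("selfish", selfish_reward), ("team", team_reward)]:
        final = best_response_dynamics(reward)
        print(name, final, "world utility:", world_utility(final))
```

In this toy model the selfish agents overuse the resource and settle at a lower world utility than the agents that are rewarded with the world utility directly. Giving every agent the world utility works here only because the system is tiny; the sketch says nothing about how well such a reward would be learnable in a large system.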
