Chaos in learning a simple two-person game

TLDR

The study investigates learning to play rock–paper–scissors when each player uses reinforcement learning to adjust the frequencies of the three possible responses and improve the average score. In the zero-sum game the learning dynamics exhibit Hamiltonian chaos, so trajectories can be simple or complex depending on initial conditions; the non-zero-sum variant can produce chaotic transients. The authors describe this as, to their knowledge, the first demonstration of Hamiltonian chaos in learning a basic two-person game. They argue that chaos provides a self-consistency condition for determining when players will learn to behave as though they were fully rational, and they caution against assuming that real people will learn to play a Nash equilibrium strategy.

Abstract

We investigate the problem of learning to play the game of rock–paper–scissors. Each player attempts to improve her/his average score by adjusting the frequency of the three possible responses, using reinforcement learning. For the zero-sum game, the learning process displays Hamiltonian chaos. Thus, the learning trajectory can be simple or complex, depending on initial conditions. We also investigate the non-zero-sum case and show that it can give rise to chaotic transients. This is, to our knowledge, the first demonstration of Hamiltonian chaos in learning a basic two-person game, extending earlier findings of chaotic attractors in dissipative systems. As we argue here, chaos provides an important self-consistency condition for determining when players will learn to behave as though they were fully rational. That chaos can occur in learning a simple game indicates one should use caution in assuming real people will learn to play a game according to a Nash equilibrium strategy.
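
The abstract does not give the learning equations, so the following is only a minimal sketch of how such frequency adjustment could be simulated. It assumes the learning is modeled by coupled replicator equations, a common continuous-time model of reinforcement-style updating. The tie payoffs `eps_x` and `eps_y`, the helper names, the fixed-step Runge-Kutta integrator, and the logarithmic quantity in `conserved_quantity` are all illustrative assumptions rather than details taken from the text. Under these assumptions the game is zero-sum when `eps_x + eps_y == 0`, and the logarithmic quantity stays constant along a trajectory, which is the kind of conservation law associated with Hamiltonian dynamics.

```python
import math


def payoff_matrix(eps):
    """Rock-paper-scissors payoffs: rows are own move, columns the opponent's.

    Win = +1, loss = -1, tie = eps (hypothetical tie-payoff parameter).
    """
    return [[eps, -1.0, 1.0],
            [1.0, eps, -1.0],
            [-1.0, 1.0, eps]]


def replicator_rhs(x, y, A, B):
    """Time derivatives of both players' response frequencies (assumed model)."""
    Ay = [sum(A[i][j] * y[j] for j in range(3)) for i in range(3)]
    Bx = [sum(B[i][j] * x[j] for j in range(3)) for i in range(3)]
    x_avg = sum(x[i] * Ay[i] for i in range(3))  # average score of player x
    y_avg = sum(y[i] * Bx[i] for i in range(3))  # average score of player y
    # A response grows when it scores above the player's current average.
    dx = [x[i] * (Ay[i] - x_avg) for i in range(3)]
    dy = [y[i] * (Bx[i] - y_avg) for i in range(3)]
    return dx, dy


def rk4_step(x, y, A, B, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shift(v, k, h):
        return [v[i] + h * k[i] for i in range(3)]

    k1x, k1y = replicator_rhs(x, y, A, B)
    k2x, k2y = replicator_rhs(shift(x, k1x, dt / 2), shift(y, k1y, dt / 2), A, B)
    k3x, k3y = replicator_rhs(shift(x, k2x, dt / 2), shift(y, k2y, dt / 2), A, B)
    k4x, k4y = replicator_rhs(shift(x, k3x, dt), shift(y, k3y, dt), A, B)
    new_x = [x[i] + dt / 6 * (k1x[i] + 2 * k2x[i] + 2 * k3x[i] + k4x[i])
             for i in range(3)]
    new_y = [y[i] + dt / 6 * (k1y[i] + 2 * k2y[i] + 2 * k3y[i] + k4y[i])
             for i in range(3)]
    return new_x, new_y


def conserved_quantity(x, y):
    """Log-sum of all frequencies; constant in the zero-sum case of this model."""
    return -(sum(math.log(v) for v in x) + sum(math.log(v) for v in y)) / 3.0


def simulate(x0, y0, eps_x, eps_y, dt=0.01, steps=2000):
    """Integrate the assumed learning dynamics and return the final frequencies."""
    A, B = payoff_matrix(eps_x), payoff_matrix(eps_y)
    x, y = list(x0), list(y0)
    for _ in range(steps):
        x, y = rk4_step(x, y, A, B, dt)
    return x, y
```

Calling `simulate([0.5, 0.3, 0.2], [0.2, 0.3, 0.5], 0.5, -0.5)` integrates one zero-sum trajectory. Running it from several nearby initial conditions and comparing the results is one way to explore the dependence on initial conditions that the abstract describes. Choosing `eps_x + eps_y != 0` gives a non-zero-sum game in this parameterization.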
