Safe Opponent Exploitation

Abstract

We consider the problem of playing a repeated two-player zero-sum game safety: that is, guaranteeing at least the value of the game per period in expectation regardless of the strategy used by the opponent. Playing a stage-game equilibrium strategy at each time step clearly guarantees safety, and prior work has (incorrectly) stated that it is impossible to simultaneously deviate from a stage-game equilibrium (in hope of exploiting a suboptimal opponent) and to guarantee safety. We show that such profitable deviations are indeed possible specifically in games where certain types of “gift” strategies exist, which we define formally. We show that the set of strategies constituting such gifts can be strictly larger than the set of iteratively weakly-dominated strategies; this disproves another recent assertion which states that all noniteratively weakly dominated strategies are best responses to each equilibrium strategy of the other player. We present a full characterization of safe strategies, and develop efficient algorithms for exploiting suboptimal opponents while guaranteeing safety. We also provide analogous results for extensive-form games of perfect and imperfect information, and present safe exploitation algorithms and full characterizations of safe strategies for those settings as well. We present experimental results in Kuhn poker, a canonical test problem for game-theoretic algorithms. Our experiments show that (1) aggressive safe exploitation strategies significantly outperform adjusting the exploitation within stage-game equilibrium strategies only and (2) all the safe exploitation strategies significantly outperform a (nonsafe) best response strategy against strong dynamic opponents.

References

Page 1

	Year	Citations

Page 1