Publication | Closed Access
Experience-weighted Attraction Learning in Normal Form Games
Citations: 1.5K
References: 49
Year: 1999
Experience-weighted attraction (EWA) learning blends reinforcement learning and weighted fictitious play (belief learning). Each strategy has an attraction that is updated from payoff experience and converted into a choice probability by a rule such as the logit. A parameter δ scales the hypothetical reinforcement of unchosen strategies, based on the payoffs they would have yielded, relative to the reinforcement of chosen strategies; two discount rates, φ and ρ, separately discount previous attractions and an experience weight. The model is calibrated on part of three experimental data sets and used to predict a hold-out sample. Estimates of δ are generally around 0.5, φ around 0.8 to 1, and ρ ranges from 0 to φ. The reinforcement and belief-learning special cases are generally rejected in favor of EWA, though belief models do better in some constant-sum games. EWA thus lets attractions begin and grow flexibly, as choice reinforcement does, while reinforcing unchosen strategies substantially, as belief-based models implicitly do.
In 'experience-weighted attraction' (EWA) learning, strategies have attractions that reflect initial predispositions, are updated based on payoff experience, and determine choice probabilities according to some rule (e.g., logit). A key feature is a parameter δ that weights the strength of hypothetical reinforcement of strategies that were not chosen according to the payoff they would have yielded, relative to reinforcement of chosen strategies according to received payoffs. The other key features are two discount rates, φ and ρ, which separately discount previous attractions, and an experience weight. EWA includes reinforcement learning and weighted fictitious play (belief learning) as special cases, and hybridizes their key elements. When δ = 0 and ρ = 0, cumulative choice reinforcement results. When δ = 1 and ρ = φ, levels of reinforcement of strategies are exactly the same as expected payoffs given weighted fictitious play beliefs. Using three sets of experimental data, parameter estimates of the model were calibrated on part of the data and used to predict a holdout sample. Estimates of δ are generally around 0.50, φ around 0.8 to 1, and ρ varies from 0 to φ. Reinforcement and belief-learning special cases are generally rejected in favor of EWA, though belief models do better in some constant-sum games. EWA is able to combine the best features of previous approaches, allowing attractions to begin and grow flexibly as choice reinforcement does, but reinforcing unchosen strategies substantially as belief-based models implicitly do.
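The abstract describes the update in words only. As a rough illustration, the sketch below assumes the commonly cited form of the EWA rule: the experience weight evolves as N(t) = ρ·N(t−1) + 1, each attraction is a φ-discounted, experience-weighted average of its previous value and the current payoff, unchosen strategies have their payoff scaled by δ, and a logit rule with sensitivity λ maps attractions to choice probabilities. The function names (`ewa_update`, `logit_probs`), the argument layout, and the sensitivity parameter `lam` are hypothetical choices for this example and do not come from the text above.

```python
import math


def ewa_update(attractions, experience, chosen, payoffs, delta, phi, rho):
    """One EWA step for a single player (illustrative sketch, assumed form).

    attractions: previous attractions A_j(t-1), one per strategy
    experience:  previous experience weight N(t-1)
    chosen:      index of the strategy actually played this round
    payoffs:     payoff each strategy earned, or would have earned,
                 against the opponents' actual play this round
    """
    # Assumed experience-weight rule: N(t) = rho * N(t-1) + 1
    new_experience = rho * experience + 1.0
    new_attractions = []
    for j, (a, pi) in enumerate(zip(attractions, payoffs)):
        # The chosen strategy is reinforced by its full payoff;
        # unchosen strategies by delta times the forgone payoff.
        weight = 1.0 if j == chosen else delta
        new_attractions.append(
            (phi * experience * a + weight * pi) / new_experience
        )
    return new_attractions, new_experience


def logit_probs(attractions, lam):
    """Logit choice rule: P_j is proportional to exp(lam * A_j)."""
    m = max(attractions)  # subtract the max for numerical stability
    exps = [math.exp(lam * (a - m)) for a in attractions]
    total = sum(exps)
    return [e / total for e in exps]
```

Under these assumptions, setting `delta=0` and `rho=0` with an initial experience weight of 1 keeps N(t) at 1, so only the chosen strategy accumulates its φ-discounted payoffs, which matches the cumulative choice reinforcement case mentioned in the abstract. Setting `delta=1` and `rho=phi` reinforces every strategy by its realized or forgone payoff, which corresponds to the weighted fictitious play case.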