Publication | Closed Access

Pupil Diameter Predicts Changes in the Exploration–Exploitation Trade-off: Evidence for the Adaptive Gain Theory

Citations: 454

References: 42

Year: 2010

TLDR

The adaptive regulation of the exploration–exploitation balance is crucial for optimal performance. Animal and computational studies suggest that the locus coeruleus–norepinephrine (LC–NE) system mediates utility‑driven shifts in control state, with pupil diameter serving as an indirect marker of LC activity. The study investigated how pupil diameter relates to task utility and choice strategy, testing whether baseline and dynamic pupil changes predict exploration versus exploitation. Participants’ pupil diameter was recorded while they performed a gambling task with a gradually changing payoff structure, and each choice was classified as exploitative or exploratory using a computational reinforcement learning model. Exploratory choices were preceded by a larger baseline pupil diameter, baseline pupil size predicted individual exploration tendency, and pupil changes around choice transitions tracked task utility. Together, these results provide evidence that pupil diameter reflects the LC–NE‑mediated control state in the exploration–exploitation trade‑off.

Abstract

The adaptive regulation of the balance between exploitation and exploration is critical for the optimization of behavioral performance. Animal research and computational modeling have suggested that changes in exploitative versus exploratory control state in response to changes in task utility are mediated by the neuromodulatory locus coeruleus–norepinephrine (LC–NE) system. Recent studies have suggested that utility-driven changes in control state correlate with pupil diameter, and that pupil diameter can be used as an indirect marker of LC activity. We measured participants' pupil diameter while they performed a gambling task with a gradually changing payoff structure. Each choice in this task can be classified as exploitative or exploratory using a computational model of reinforcement learning. We examined the relationship between pupil diameter, task utility, and choice strategy (exploitation vs. exploration), and found that (i) exploratory choices were preceded by a larger baseline pupil diameter than exploitative choices; (ii) individual differences in baseline pupil diameter were predictive of an individual's tendency to explore; and (iii) changes in pupil diameter surrounding the transition between exploitative and exploratory choices correlated with changes in task utility. These findings provide novel evidence that pupil diameter correlates closely with control state, and are consistent with a role for the LC–NE system in the regulation of the exploration–exploitation trade-off in humans.
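
The abstract does not say which reinforcement learning model was used to label choices, so the following is only an illustrative sketch of the general idea and not the study's method. It assumes a simple delta-rule learner that tracks one value estimate per option and labels a choice as exploitative when it selects an option with the currently highest estimate, and exploratory otherwise. The function name `classify_choices`, the learning rate, the initial values, and the number of options are all hypothetical.

```python
from typing import List, Sequence


def classify_choices(
    choices: Sequence[int],
    rewards: Sequence[float],
    n_options: int = 4,          # assumed number of options; not stated in the abstract
    learning_rate: float = 0.3,  # assumed delta-rule step size
    initial_value: float = 0.0,  # assumed starting estimate for every option
) -> List[str]:
    """Label each choice 'exploit' or 'explore' under a simple delta-rule learner."""
    if len(choices) != len(rewards):
        raise ValueError("choices and rewards must have the same length")

    values = [initial_value] * n_options
    labels: List[str] = []
    for choice, reward in zip(choices, rewards):
        # Classify using the estimates held *before* this trial's outcome is seen.
        best = max(values)
        labels.append("exploit" if values[choice] == best else "explore")
        # Delta-rule update: move the chosen option's estimate toward the reward.
        values[choice] += learning_rate * (reward - values[choice])
    return labels


if __name__ == "__main__":
    print(classify_choices([0, 0, 1, 0], [10, 10, 2, 10],
                           n_options=2, learning_rate=0.5))
```

With labels like these, trial-wise measures such as baseline pupil diameter could then be compared between the two choice types; ties between equally valued options count as exploitative in this sketch, which is a design choice rather than something the source specifies.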
