Publication | Open Access
Bandit problems with infinitely many arms
115
Citations
9
References
1997
Year
Contextual BanditN ChoicesEngineeringStochastic OptimizationStochastic GameGame TheoryUniform DistributionBusinessGame-theoretic ProbabilityProbability TheoryRobot LearningGamesCombinatorial OptimizationMechanism DesignExploration V ExploitationBandit ProblemsBandit Problem Consisting
We consider a bandit problem consisting of a sequence of n choices from an infinite number of Bernoulli arms, with $n \to \infty$. The objective is to minimize the long-run failure rate. The Bernoulli parameters are independent observations from a distribution F. We first assume F to be the uniform distribution on (0, 1) and consider various extensions. In the uniform case we show that the best lower bound for the expected failure proportion is between $\sqrt{2}/\sqrt{n}$ and $2/\sqrt{n}$ and we exhibit classes of strategies that achieve the latter.
| Year | Citations | |
|---|---|---|
Page 1
Page 1