The knowledge gradient algorithm for online subset selection - Concepedia


Abstract

We derive a one-period look-ahead policy for online subset selection problems, where learning about one subset also gives us information about other subsets. The subset selection problem is treated as a multi-armed bandit problem with correlated prior beliefs. We show that our decision rule is easily computable, and present experimental evidence that the policy is competitive against other online learning policies.
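To illustrate how measuring one subset can inform beliefs about others, the following is a minimal sketch of a Bayesian update under correlated beliefs. It assumes a multivariate normal belief over the subset values and a known measurement noise variance; the abstract states neither, and the names `update_beliefs`, `mu`, `sigma`, and `noise_var` are hypothetical. It shows only the belief update, not the paper's decision rule.

```python
def update_beliefs(mu, sigma, x, y, noise_var):
    """Condition a belief N(mu, sigma) on a noisy observation y of alternative x.

    Assumption (not from the source): Gaussian beliefs and Gaussian noise
    with known variance noise_var. mu is a list of means, sigma a list of
    lists holding the covariance matrix.
    """
    n = len(mu)
    denom = noise_var + sigma[x][x]
    # Column x of the covariance: how strongly each alternative co-varies with x.
    col = [sigma[i][x] for i in range(n)]
    # Every mean shifts in proportion to its covariance with the measured alternative.
    new_mu = [mu[i] + (y - mu[x]) * col[i] / denom for i in range(n)]
    # Uncertainty shrinks for every alternative correlated with x.
    new_sigma = [
        [sigma[i][j] - col[i] * col[j] / denom for j in range(n)]
        for i in range(n)
    ]
    return new_mu, new_sigma


# Three alternatives: 0 and 1 are correlated, 2 is independent of both.
mu0 = [0.0, 0.0, 0.0]
sigma0 = [[1.0, 0.5, 0.0],
          [0.5, 1.0, 0.0],
          [0.0, 0.0, 1.0]]
mu1, sigma1 = update_beliefs(mu0, sigma0, x=0, y=2.0, noise_var=1.0)
```

In this example, observing alternative 0 also raises the mean and lowers the variance of the correlated alternative 1, while the independent alternative 2 is unchanged.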
