
Publication | Closed Access

Learning evaluation functions for global optimization

60

Citations

50

References

1998

Year

Abstract

In complex sequential decision problems such as scheduling factory production, planning medical treatments, and playing backgammon, optimal decision policies are in general unknown, and it is often difficult, even for human domain experts, to manually encode good decision policies in software. The reinforcement-learning methodology of "value function approximation" (VFA) offers an alternative: systems can learn effective decision policies autonomously, simply by simulating the task and keeping statistics on which decisions lead to good ultimate performance and which do not. This thesis advances the state of the art in VFA in two ways. First, it
