Publication | Closed Access

The role of exploration in learning control

Citations: 227 | References: 29 | Year: 1992

Sebastian Thrun

Unknown Venue

Abstract

Whenever an intelligent agent learns to control an unknown environment, two opposing objectives have to be combined. On the one hand, the environment must be sufficiently explored in order to identify a (sub-)optimal controller. For instance, a robot facing an unknown environment has to spend time moving around and acquiring knowledge. On the other hand, the environment must also be exploited during learning, i.e., experience gained during learning must also be considered for action selection, if one is interested in minimizing the costs of learning. For example, although a robot has to explore its environment, it should avoid collisions with obstacles once it has received some negative reward for collisions. For efficient learning, actions should thus be generated in such a way that the environment is explored and pain is avoided. This fundamental trade-off between exploration and exploitation demands efficient exploration capabilities, maximizing the effect of learning while minimizing the costs of exploration.
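The exploration–exploitation trade-off described in the abstract is commonly illustrated with an ε-greedy action-selection rule on a multi-armed bandit. The sketch below is a standard textbook heuristic, not the paper's own method; the reward values and the exploration rate `epsilon` are hypothetical choices for illustration.

```python
import random

def epsilon_greedy(q_values, epsilon):
    """With probability epsilon take a random action (explore),
    otherwise take the action with the highest estimated value (exploit)."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

# Toy bandit: hypothetical true expected rewards for three actions.
true_rewards = [0.2, 0.5, 0.8]
q = [0.0, 0.0, 0.0]   # running estimates of each action's value
counts = [0, 0, 0]    # how often each action was tried

random.seed(0)
for t in range(1000):
    a = epsilon_greedy(q, epsilon=0.1)
    r = true_rewards[a] + random.gauss(0.0, 0.1)  # noisy observed reward
    counts[a] += 1
    q[a] += (r - q[a]) / counts[a]                # incremental mean update

best = max(range(3), key=lambda a: q[a])
```

With a small ε, the agent mostly exploits its current estimates (avoiding "painful" low-reward actions once discovered) but still occasionally explores, so the estimates converge toward the true values and `best` identifies the highest-reward action.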

