Adaptive linear quadratic control using policy iteration

Publication | Closed Access

Citations: 413

Year: 2005

Steven J. Bradtke, B. Erik Ydstie, Andrew G. Barto

Unknown Venue

Policy Iteration, Engineering, Mathematical Control Theory, Intelligent Control, Process Control, Adaptive Control, Systems Engineering, Quadratic Regulation, Continuous Problem, Business, Reinforcement Learning (Educational Psychology), Learning Control, Particular Signal Vector, Dynamic Optimization, Stability

Abstract

In this paper we present stability and convergence results for dynamic programming (DP)-based reinforcement learning applied to linear quadratic regulation (LQR). The specific algorithm we analyze is based on Q-learning, and we prove that it converges to an optimal controller provided that the underlying system is controllable and a particular signal vector is persistently excited. This is the first convergence result for DP-based reinforcement learning algorithms for a continuous problem.
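
The idea in the abstract can be illustrated with a minimal sketch of Q-function policy iteration for LQR. It is not the paper's exact algorithm: it assumes an undiscounted cost x'Ex + u'Fu, a feedback law of the form u = Kx, a stabilizing initial gain, and a batch least-squares fit of the Q-function in place of a recursive estimator. The matrices `A`, `B`, `E`, `F` and all function names are hypothetical choices made for the example. The exploration noise added to the control is what keeps the regression data sufficiently exciting.

```python
import numpy as np

def quad_features(z):
    # Monomials z_i * z_j for i <= j, in row-major upper-triangular order.
    i, j = np.triu_indices(len(z))
    return z[i] * z[j]

def theta_to_H(theta, d):
    # Rebuild the symmetric matrix H with z' H z = theta . quad_features(z).
    # Off-diagonal entries of theta hold 2 * H_ij, so symmetrizing halves them.
    H = np.zeros((d, d))
    H[np.triu_indices(d)] = theta
    return (H + H.T) / 2.0

def evaluate_policy(A, B, E, F, K, rng, n_samples=200, noise_std=1.0):
    # Fit Q_K(x, u) = [x; u]' H [x; u] from one noisy closed-loop trajectory.
    # A and B are used only to simulate the plant; the fit never reads them.
    n, m = B.shape
    x = rng.standard_normal(n)
    rows, costs = [], []
    for _ in range(n_samples):
        u = K @ x + noise_std * rng.standard_normal(m)  # exploration noise
        x_next = A @ x + B @ u
        z = np.concatenate([x, u])
        z_next = np.concatenate([x_next, K @ x_next])
        # Bellman equation: Q_K(x, u) - Q_K(x', K x') = one-step cost.
        rows.append(quad_features(z) - quad_features(z_next))
        costs.append(x @ E @ x + u @ F @ u)
        x = x_next
    theta, *_ = np.linalg.lstsq(np.array(rows), np.array(costs), rcond=None)
    return theta_to_H(theta, n + m)

def improve_policy(H, n):
    # Minimize the quadratic Q-function over u: K = -H_uu^{-1} H_ux.
    return -np.linalg.solve(H[n:, n:], H[n:, :n])

def q_policy_iteration(A, B, E, F, K0, n_iters=10, seed=0):
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    K = K0
    for _ in range(n_iters):
        H = evaluate_policy(A, B, E, F, K, rng)
        K = improve_policy(H, n)
    return K, H

# Hypothetical controllable, open-loop stable plant, so K0 = 0 is stabilizing.
A = np.array([[0.9, 0.3], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
E = np.eye(2)
F = np.array([[1.0]])
K, H = q_policy_iteration(A, B, E, F, K0=np.zeros((1, 2)))
print("learned gain K:", K)
```

Because the simulated plant is noise-free apart from the injected exploration signal, each least-squares fit recovers the Q-function of the current gain essentially exactly, and the gain sequence is then the one produced by classical policy iteration on the Riccati equation. With measurement or process noise, or with too little excitation, the fit would be approximate and this simple sketch offers no guarantee.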
