Publication | Open Access
Parameterized Batch Reinforcement Learning for Longitudinal Control of Autonomous Land Vehicles
Citations: 102
References: 35
Year: 2017
Keywords: Artificial Intelligence, Batch Reinforcement Learning, Trajectory Planning, Machine Learning, Engineering, Autonomous Land Vehicles, PBAC Learning Algorithm, Vehicle Control, Intelligent Control, Longitudinal Control, Systems Engineering, Intelligent Systems, Robot Learning, Learning Control, Parameterized Batch, Trajectory Optimization
This paper presents a parameterized batch reinforcement learning algorithm for near-optimal longitudinal control of autonomous land vehicles (ALVs). The proposed approach uses an actor-critic architecture, where parameterized feature vectors based on kernels are learned from collected samples for approximating the value functions and policies. One difference between the parameterized batch actor-critic (PBAC) algorithm and previous actor-critic learning approaches is that the critic and actor in PBAC share the same linear features, which has been theoretically proved to be a beneficial property for the convergence of actor-critic learning approaches. In order to obtain better learning efficiency, least-squares-based batch updating rules are designed for the critic and actor, respectively. Based on the PBAC learning algorithm, a data-driven longitudinal control method is presented for ALVs to obtain near-optimal control policies which adaptively tune the fuel/brake control signals to track different speeds. A multiobjective reward function is designed so that both tracking precision and driving smoothness are considered. Extensive experiments were conducted on a real ALV platform while driving on flat, slippery, sloping, and bumpy roads. The experimental results illustrate the superiority of the PBAC-based self-learning controller over conventional longitudinal control methods such as proportional-integral (PI) control and learning-based PI control.
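The abstract names three ingredients: kernel-based features, a least-squares batch update for the critic, and a multiobjective reward. The following Python sketch illustrates how such pieces might fit together. It is not the paper's PBAC algorithm: the Gaussian kernel, the LSTD-style solve, the quadratic reward, and every function name, weight, and parameter value here are assumptions chosen for illustration, and the actor update is omitted because the abstract does not specify it.

```python
import numpy as np


def kernel_features(states, centers, width=1.0):
    """Gaussian kernel features, one column per center (assumed kernel choice)."""
    states = np.atleast_2d(states)
    # Squared Euclidean distance between every state and every center.
    sq_dist = ((states[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * width ** 2))


def reward(speed_error, control_change, w_track=1.0, w_smooth=0.1):
    """Hypothetical multiobjective reward: penalize tracking error
    (precision) and abrupt control changes (smoothness)."""
    return -(w_track * speed_error ** 2 + w_smooth * control_change ** 2)


def lstd_critic(phi, phi_next, rewards, gamma=0.95, reg=1e-6):
    """Batch least-squares TD solve for linear value weights w:
    A w = b, with A = Phi^T (Phi - gamma * Phi_next) and b = Phi^T r.
    A small ridge term keeps A invertible (an assumption of this sketch)."""
    A = phi.T @ (phi - gamma * phi_next) + reg * np.eye(phi.shape[1])
    b = phi.T @ rewards
    return np.linalg.solve(A, b)


if __name__ == "__main__":
    # Synthetic batch: the state is a 1-D speed tracking error that decays
    # toward zero. This stands in for logged vehicle data, which is not
    # available here.
    rng = np.random.default_rng(0)
    errors = rng.uniform(-2.0, 2.0, size=(200, 1))
    next_errors = 0.9 * errors + rng.normal(0.0, 0.05, size=(200, 1))
    control_changes = rng.uniform(-0.5, 0.5, size=200)
    r = reward(errors[:, 0], control_changes)

    centers = np.linspace(-2.0, 2.0, 9).reshape(-1, 1)
    phi = kernel_features(errors, centers, width=0.5)
    phi_next = kernel_features(next_errors, centers, width=0.5)

    w = lstd_critic(phi, phi_next, r)
    # Estimated values at zero error and at a large error.
    print(kernel_features(np.array([[0.0], [1.5]]), centers, width=0.5) @ w)
```

In a shared-feature design like the one the abstract describes, the same `kernel_features` output would also feed the actor, so the policy and the value function are both linear in one feature vector.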