Safe Reinforcement Learning for Autonomous Vehicle Using Monte Carlo Tree Search

Abstract

Reinforcement learning (RL) has gradually demonstrated its decision-making ability in autonomous driving. An RL agent learns to map states to actions by interacting with the environment so as to maximize long-term reward. Within a limited number of interactions, the learner obtains a suitable driving policy according to the designed reward function. However, traditional reinforcement learning produces many unsafe behaviors during training. This paper proposes an RL-based method that combines an RL agent with the Monte Carlo tree search (MCTS) algorithm to reduce unsafe behaviors. The proposed safe reinforcement learning framework consists mainly of two modules: a risk state estimation module and a safe policy search module. When the risk state estimation module, using the current state information and the action output by the RL agent, predicts that the future state will be risky, the MCTS-based safe policy search module is activated to guarantee safer exploration by adding an additional reward for risky actions. We test the approach in several random overtaking scenarios, where it achieves faster convergence and safer behaviors than traditional reinforcement learning.
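The interplay of the two modules can be illustrated with a minimal sketch. The abstract does not specify the risk model, the reward values, or the MCTS configuration, so everything below is an assumption made for illustration: a one-dimensional car-following state `(gap, ego_speed, lead_speed)`, a fixed safe-gap threshold as the risk test, a negative value for the additional reward, and the hypothetical names `step`, `is_risky`, `mcts_search`, and `safe_action`.

```python
import math
import random
from dataclasses import dataclass, field

# All constants are illustrative assumptions, not values from the paper.
ACTIONS = (-2.0, 0.0, 2.0)   # candidate accelerations in m/s^2
DT = 0.5                     # simulation step in seconds
SAFE_GAP = 5.0               # a gap below this (metres) counts as risky
RISK_PENALTY = -10.0         # assumed form of the additional reward

def step(state, accel):
    """Toy one-step prediction for (gap, ego_speed, lead_speed)."""
    gap, ego_v, lead_v = state
    ego_v = max(0.0, ego_v + accel * DT)
    return (gap + (lead_v - ego_v) * DT, ego_v, lead_v)

def is_risky(state):
    """Risk state estimation (assumed): gap below the safe threshold."""
    return state[0] < SAFE_GAP

def reward(state):
    """Small progress reward, or the penalty in a risky state."""
    return RISK_PENALTY if is_risky(state) else 0.1 * state[1] * DT

@dataclass
class Node:
    state: tuple
    parent: "Node | None" = None
    children: dict = field(default_factory=dict)
    visits: int = 0
    value: float = 0.0   # sum of full-trajectory returns through this node

def rollout(state, depth, rng):
    """Random simulation that stops at the first risky state."""
    total = 0.0
    for _ in range(depth):
        state = step(state, rng.choice(ACTIONS))
        total += reward(state)
        if is_risky(state):
            break
    return total

def mcts_search(root_state, iterations=200, depth=6, c=1.4, seed=0):
    """Plain UCB1-based MCTS; returns the most visited root action."""
    rng = random.Random(seed)
    root = Node(root_state)
    for _ in range(iterations):
        node, ret, d = root, 0.0, 0
        # Selection: descend with UCB1 while the node is fully expanded.
        while len(node.children) == len(ACTIONS) and d < depth:
            parent = node
            node = max(
                parent.children.values(),
                key=lambda ch: ch.value / ch.visits
                + c * math.sqrt(math.log(parent.visits) / ch.visits),
            )
            ret += reward(node.state)
            d += 1
        # Expansion: risky states are terminal (the root is always expanded).
        if d < depth and (node is root or not is_risky(node.state)):
            a = rng.choice([a for a in ACTIONS if a not in node.children])
            node.children[a] = Node(step(node.state, a), parent=node)
            node = node.children[a]
            ret += reward(node.state)
            d += 1
        # Simulation from the reached node, unless it is terminal.
        if not is_risky(node.state):
            ret += rollout(node.state, depth - d, rng)
        # Backpropagation of the trajectory return.
        while node is not None:
            node.visits += 1
            node.value += ret
            node = node.parent
    return max(root.children, key=lambda a: root.children[a].visits)

def safe_action(state, rl_action):
    """Return (action to execute, additional reward for the RL agent)."""
    if not is_risky(step(state, rl_action)):
        return rl_action, 0.0                # RL action passes unchanged
    return mcts_search(state), RISK_PENALTY  # search activated, penalty added
```

In this sketch the search runs only when the RL agent's proposed action is predicted to lead to a risky state, which mirrors the activation condition described above. How the additional reward enters the agent's training signal is not stated in the abstract, so returning it alongside the chosen action is only one possible design.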
