High-Speed Ramp Merging Behavior Decision for Autonomous Vehicles Based on Multiagent Reinforcement Learning

Abstract

To improve the decision success rate of multiagent reinforcement learning in high-speed ramp merging for autonomous vehicles, an independent proximal policy optimization (IPPO) method is presented. A Markov decision process (MDP) model for autonomous vehicle behavioral decision making is developed, and its state space, action space, and reward function are designed. The IPPO method builds on the proximal policy optimization algorithm and uses independent learning and parameter-sharing strategies. On this basis, a decision-making model for autonomous driving behavior is built. A highway ramp scenario is set up for simulation experiments. The experimental results indicate that the IPPO algorithm significantly increases the decision success rate of autonomous vehicles in the ramp merging task. Compared with the MAACKTR and GPPO algorithms, the IPPO algorithm also achieves a higher average reward and completes ramp merging more quickly.
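
The abstract does not give the IPPO objective itself, so the following is only a minimal sketch of the general idea it names: the standard clipped surrogate objective of proximal policy optimization, evaluated independently on each agent's own samples and averaged, on the assumption that all agents share one set of policy parameters. The function names, the batch dictionary keys (`logp_new`, `logp_old`, `adv`), and the clipping value 0.2 are illustrative assumptions, not details taken from the paper.

```python
import math


def clipped_surrogate(logp_new, logp_old, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate objective, averaged over samples.

    The value is meant to be maximized. Inputs are per-sample log
    probabilities under the new and old policy, plus advantage estimates.
    """
    total = 0.0
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(ln - lo)  # probability ratio pi_new / pi_old
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        total += min(ratio * adv, clipped * adv)  # pessimistic bound
    return total / len(advantages)


def shared_policy_objective(per_agent_batches, clip_eps=0.2):
    """Average of the per-agent objectives (hypothetical helper).

    Each agent contributes a batch built only from its own experience
    (independent learning); the batches are assumed to be evaluated under
    one shared parameter set (parameter sharing).
    """
    values = [
        clipped_surrogate(b["logp_new"], b["logp_old"], b["adv"], clip_eps)
        for b in per_agent_batches
    ]
    return sum(values) / len(values)
```

When the new and old policies agree, the ratio is 1 and the objective reduces to the mean advantage. When the ratio leaves the interval [1 - clip_eps, 1 + clip_eps], the clipping removes the incentive to move the policy further in the direction the advantage favors.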

