Publication | Closed Access
Learning Configurations of Operating Environment of Autonomous Vehicles to Maximize their Collisions
Citations: 57
References: 63
Year: 2022
Artificial Intelligence, Engineering, Machine Learning, Education, Autonomous Agent System, Autonomous Systems, Intelligent Systems, Reinforcement Learning (Educational Psychology), Learning Control, Lifelong Reinforcement Learning, Multi-agent Learning, Trajectory Planning, Reinforcement Learning (Computer Engineering), Intelligent Autonomous Systems, Autonomous Vehicles, Systems Engineering, Robot Learning, Autonomous Learning, Deepcollision Models, Computer Science, Autonomous Driving, Inverse Reinforcement Learning, Deep Reinforcement Learning, Operating Environment, Autonomous Intelligent System, Planning, Robotics, Trajectory Optimization
Autonomous vehicles must operate safely in dynamic, continuously changing environments. However, the operating environment of an autonomous vehicle is complicated and full of various types of uncertainties. Additionally, the operating environment has many configurations, including static and dynamic obstacles with which an autonomous vehicle must avoid collisions. Though various approaches targeting environment configuration for autonomous vehicles have shown promising results, their effectiveness in dealing with a continuously changing environment is limited. Thus, it is essential to learn realistic configurations of a continuously changing environment under which an autonomous vehicle should be tested for its ability to avoid collisions. Because its agents interact dynamically with the environment, Reinforcement Learning (RL) has shown great potential in dealing with complicated problems that require adapting to the environment. To this end, we present an RL-based environment configuration learning approach, DeepCollision, which intelligently learns environment configurations that lead an autonomous vehicle to crash. DeepCollision employs Deep Q-Learning as the RL solution and uses collision probability as the safety measure to construct the reward function. We trained four DeepCollision models and conducted an experiment to compare them with two baselines, random and greedy. Results show that DeepCollision was significantly more effective at generating collisions than the baselines. We also provide recommendations on configuring DeepCollision with the most suitable time interval for different road structures.
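The idea of rewarding an agent with collision probability can be illustrated with a minimal sketch. This is not the DeepCollision implementation: it swaps Deep Q-Learning for plain tabular Q-learning on a single-state toy problem so that it stays self-contained, and the action names, the `toy_collision_probability` values, and all hyperparameters are assumptions made up for illustration.

```python
import random

# Hypothetical environment-configuration actions; the abstract does not list them.
ACTIONS = ["add_pedestrian", "add_vehicle", "set_rain", "set_night"]


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One Q-learning step: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy choice over environment configurations."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))


def toy_collision_probability(action):
    """Stand-in safety measure in [0, 1]; the numbers are invented, not measured."""
    return {
        "add_pedestrian": 0.6,
        "add_vehicle": 0.4,
        "set_rain": 0.2,
        "set_night": 0.1,
    }[action]


def train(episodes=300, steps=10, seed=0):
    """Learn which configuration maximizes the (toy) collision probability."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state = 0  # single-state toy problem; a real setup would observe the scene
        for _ in range(steps):
            action = choose_action(q, state, epsilon=0.2, rng=rng)
            reward = toy_collision_probability(action)  # reward = collision probability
            next_state = 0
            q_update(q, state, action, reward, next_state)
            state = next_state
    return q


def best_action(q, state=0):
    """Greedy configuration under the learned Q-values."""
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
```

Calling `best_action(train())` returns the configuration with the highest learned value in this toy setting. A Deep Q-Learning variant would replace the dictionary `q` with a neural network over observed environment states, which this sketch deliberately leaves out.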