Publication | Closed Access
Generalization in Deep Reinforcement Learning for Robotic Navigation by Reward Shaping
Citations: 36
References: 27
Year: 2023
Topics: Artificial Intelligence, Engineering, Machine Learning, Global Planning, Education, Autonomous Systems, Reinforcement Learning (Educational Psychology), Learning Control, Lifelong Reinforcement Learning, Robotic Navigation, Reinforcement Learning (Computer Engineering), Robot Learning, Cognitive Science, Autonomous Learning, Sequential Decision Making, Computer Science, World Model, Reward Shaping, Local Minima, Deep Reinforcement Learning, Robotics
This paper addresses the application of Deep Reinforcement Learning (DRL) methods to local navigation, i.e., a robot equipped only with limited-range exteroceptive sensors moves towards a goal location in unknown and cluttered workspaces. Collision avoidance policies based on DRL present advantages, but they are quite susceptible to local minima, since their capacity to learn suitable actions is limited to the sensor range. We address this issue by means of reward shaping in actor-critic networks. A dense reward function that incorporates map information gained in the training stage is proposed to increase the agent's capacity to decide on the best action. We also compare the Twin Delayed Deep Deterministic Policy Gradient (TD3) and Soft Actor-Critic (SAC) algorithms for training our policy. A set of sim-to-sim and sim-to-real trials illustrates that our proposed reward shaping outperforms the compared methods in terms of generalization, arriving at the target at higher rates in maps that are prone to local minima and collisions.
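The abstract does not give the reward function itself, so the following is only a minimal sketch of one way a dense reward could use map information available at training time. Everything in it is an assumption made for illustration, not the authors' formulation: the occupancy-grid representation, the breadth-first distance field, the helper names `distance_field` and `shaped_reward`, and all constants. The idea shown is that rewarding progress along the obstacle-aware path length to the goal, rather than the straight-line distance, gives a positive signal for backing out of a dead end.

```python
from collections import deque

# Hypothetical 4-connected occupancy grid: 0 = free, 1 = obstacle.
# The obstacle forms a pocket around cell (2, 2), a typical local minimum.
GRID = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
GOAL = (0, 2)


def distance_field(grid, goal):
    """Breadth-first path length (in cells) from each free cell to the goal.

    Unreachable or occupied cells keep the value None.
    """
    rows, cols = len(grid), len(grid[0])
    dist = [[None] * cols for _ in range(rows)]
    dist[goal[0]][goal[1]] = 0
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist


def shaped_reward(dist, prev_cell, cell, collided,
                  goal_bonus=10.0, collision_penalty=-10.0,
                  progress_scale=1.0, step_cost=-0.01):
    """Dense reward based on progress along the map-aware distance field.

    All constants are illustrative placeholders, not values from the paper.
    """
    if collided:
        return collision_penalty
    d_now = dist[cell[0]][cell[1]]
    d_prev = dist[prev_cell[0]][prev_cell[1]]
    if d_now == 0:
        return goal_bonus
    if d_now is None or d_prev is None:
        # No map-based information for this transition: only the step cost.
        return step_cost
    return step_cost + progress_scale * (d_prev - d_now)


DIST = distance_field(GRID, GOAL)
# Leaving the pocket moves the robot away from the goal in a straight line,
# yet it shortens the path length, so the shaped reward is positive.
escape_reward = shaped_reward(DIST, (2, 2), (3, 2), collided=False)
```

In this sketch the distance field is computed once per training map and only consulted when computing rewards, so the policy itself still observes nothing beyond its limited-range sensors, which matches the setting described in the abstract.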