Research on Reinforcement Learning Based Warehouse Robot Navigation Algorithm in Complex Warehouse Layout

Abstract

This paper addresses the problem of finding optimal paths efficiently in complex warehouse layouts while supporting real-time decision-making. We propose an approach that combines Proximal Policy Optimization (PPO) with Dijkstra's algorithm, referred to as Proximal Policy-Dijkstra (PP-D). In PP-D, PPO handles policy learning and real-time decision-making, while Dijkstra's algorithm handles globally optimal path planning; together they target high navigation accuracy and efficient path planning. Specifically, PPO's stable policy update mechanism lets robots adapt and refine their action policies quickly in dynamic environments, whereas Dijkstra's algorithm provides optimal paths in static settings. Comparative experiments show that PP-D outperforms traditional algorithms, particularly in navigation prediction accuracy and system robustness. In complex warehouse layouts in particular, PP-D identifies optimal paths more precisely and reduces collisions and stagnation, which supports the reliability and effectiveness of the proposed navigation algorithm in intricate warehouse environments.
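
To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows textbook Dijkstra on a small occupancy grid (the static, global part) and the standard PPO clipped surrogate objective (the stable policy update). The grid encoding, the function names `dijkstra_grid` and `ppo_clipped_objective`, and the clipping value `eps=0.2` are illustrative assumptions; the abstract does not specify how PP-D couples the two components.

```python
import heapq


def dijkstra_grid(grid, start, goal):
    """Shortest 4-connected path on a grid (hypothetical encoding: 1 = shelf, 0 = free)."""
    rows, cols = len(grid), len(grid[0])
    dist = {start: 0}
    prev = {}
    heap = [(0, start)]
    while heap:
        d, cell = heapq.heappop(heap)
        if cell == goal:
            break
        if d > dist[cell]:
            continue  # stale heap entry, a shorter route was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                nd = d + 1  # unit cost per move (assumption)
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = cell
                    heapq.heappush(heap, (nd, (nr, nc)))
    if goal not in dist:
        return None  # goal unreachable
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]


def ppo_clipped_objective(ratios, advantages, eps=0.2):
    """Mean of min(r * A, clip(r, 1 - eps, 1 + eps) * A) over samples."""
    total = 0.0
    for r, a in zip(ratios, advantages):
        clipped = max(1.0 - eps, min(r, 1.0 + eps))
        total += min(r * a, clipped * a)
    return total / len(ratios)


# Toy layout: one row of shelves with a single aisle at column 2.
warehouse = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
global_path = dijkstra_grid(warehouse, (0, 0), (2, 0))
objective = ppo_clipped_objective([1.5, 0.5], [1.0, -1.0])
```

One plausible coupling, stated here only as an assumption, is to use the Dijkstra path as waypoints or as a reward-shaping signal for the PPO policy, which would then handle local, dynamic obstacles.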
