Publication | Open Access
Hindsight Experience Replay
Citations: 352
References: 38
Year: 2017
Artificial Intelligence, Cognitive Science, Engineering, Machine Learning, Reward Hacking, Autonomous Learning, Memory, Hindsight Experience Replay, Sparse Rewards, Action Model Learning, Sequential Decision Making, Computer Science, Robot Learning, Learning Control, Robotics, Perception-action Loop, Social Sciences
Dealing with sparse rewards is one of the biggest challenges in Reinforcement Learning (RL). The authors propose Hindsight Experience Replay to enable sample‑efficient learning from sparse binary rewards, aiming to simplify reward engineering and demonstrate its effectiveness on robotic arm manipulation tasks. The method augments any off‑policy RL algorithm by replaying failed trajectories with alternate goals, effectively creating an implicit curriculum, and is evaluated on pushing, sliding, and pick‑and‑place tasks using only binary success rewards. Ablation studies confirm that Hindsight Experience Replay is essential for training in these sparse‑reward environments, and policies trained in simulation transfer successfully to a real robot, completing the tasks.
Dealing with sparse rewards is one of the biggest challenges in Reinforcement Learning (RL). We present a novel technique called Hindsight Experience Replay, which allows sample-efficient learning from rewards that are sparse and binary and therefore avoids the need for complicated reward engineering. It can be combined with an arbitrary off-policy RL algorithm and may be seen as a form of implicit curriculum. We demonstrate our approach on the task of manipulating objects with a robotic arm. In particular, we run experiments on three different tasks: pushing, sliding, and pick-and-place, in each case using only binary rewards indicating whether or not the task is completed. Our ablation studies show that Hindsight Experience Replay is a crucial ingredient that makes training possible in these challenging environments. We show that our policies trained in a physics simulation can be deployed on a physical robot and successfully complete the task. The video presenting our experiments is available at https://goo.gl/SMrQnI.
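The core idea of replaying a failed trajectory with an alternate goal can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from this page: the `Transition` container, the 2-D point states, the 0 / -1 reward convention with tolerance `tol`, and the choice to relabel with the final state reached in the episode. The plain list `buffer` stands in for the replay buffer of whichever off-policy algorithm is used.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

Vec = Tuple[float, float]  # hypothetical 2-D state / goal representation


@dataclass(frozen=True)
class Transition:
    state: Vec
    action: Vec
    next_state: Vec
    goal: Vec
    reward: float


def sparse_reward(achieved: Vec, goal: Vec, tol: float = 0.05) -> float:
    """Binary reward (assumed convention): 0.0 within `tol` of the goal, else -1.0."""
    dist = ((achieved[0] - goal[0]) ** 2 + (achieved[1] - goal[1]) ** 2) ** 0.5
    return 0.0 if dist <= tol else -1.0


def relabel_with_final_state(episode: List[Transition]) -> List[Transition]:
    """Copy the episode, pretending the state finally reached was the goal all along."""
    if not episode:
        return []
    hindsight_goal = episode[-1].next_state
    return [
        replace(t, goal=hindsight_goal,
                reward=sparse_reward(t.next_state, hindsight_goal))
        for t in episode
    ]


def store_episode(buffer: List[Transition], episode: List[Transition]) -> None:
    """Store the original transitions plus their hindsight-relabeled copies."""
    buffer.extend(episode)
    buffer.extend(relabel_with_final_state(episode))


# A failed two-step episode: the goal is (1.0, 1.0) but the agent only reaches (0.2, 0.0).
goal = (1.0, 1.0)
steps = [((0.0, 0.0), (0.1, 0.0), (0.1, 0.0)),
         ((0.1, 0.0), (0.1, 0.0), (0.2, 0.0))]
episode = [Transition(s, a, s2, goal, sparse_reward(s2, goal)) for s, a, s2 in steps]

buffer: List[Transition] = []
store_episode(buffer, episode)
```

In this toy episode every original transition carries the failure reward, while the relabeled copy of the last transition carries the success reward, so the learner sees a useful signal even though the intended goal was never reached.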