Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU

Abstract

We introduce a hybrid CPU/GPU version of the Asynchronous Advantage Actor-Critic (A3C) algorithm, currently the state-of-the-art method in reinforcement learning for various gaming tasks. We analyze its computational traits and concentrate on aspects critical to leveraging the GPU's computational power. We introduce a system of queues and a dynamic scheduling strategy, potentially helpful for other asynchronous algorithms as well. Our hybrid CPU/GPU version of A3C, based on TensorFlow, achieves a significant speedup compared to a CPU implementation; we make it publicly available to other researchers at https://github.com/NVlabs/GA3C.
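
To give a feel for what a system of queues between agents and a GPU-side model might look like, here is a minimal Python sketch using only the standard library. It is not the GA3C code: the names `Predictor`, `fake_policy`, and `run_demo` are hypothetical, the "model" is a placeholder function rather than a TensorFlow network, and the dynamic scheduling strategy mentioned in the abstract is not covered. The sketch assumes that agents submit states to one shared request queue, and that a single predictor thread groups whatever requests are pending into a batch and returns each result on a per-agent reply queue.

```python
import queue
import threading


def fake_policy(batch):
    # Hypothetical stand-in for one batched forward pass on the GPU.
    return [2 * state for state in batch]


class Predictor(threading.Thread):
    """Drains the shared request queue and answers requests in batches."""

    def __init__(self, request_q, reply_qs, max_batch=8):
        super().__init__(daemon=True)
        self.request_q = request_q
        self.reply_qs = reply_qs
        self.max_batch = max_batch

    def run(self):
        while True:
            item = self.request_q.get()  # block until one request arrives
            if item is None:             # None is the shutdown signal
                return
            batch, stop = [item], False
            # Opportunistically add requests that are already waiting.
            while len(batch) < self.max_batch:
                try:
                    nxt = self.request_q.get_nowait()
                except queue.Empty:
                    break
                if nxt is None:
                    stop = True
                    break
                batch.append(nxt)
            agent_ids, states = zip(*batch)
            outputs = fake_policy(list(states))
            for agent_id, output in zip(agent_ids, outputs):
                self.reply_qs[agent_id].put(output)
            if stop:
                return


def run_demo(num_agents=4, steps=3):
    request_q = queue.Queue()
    reply_qs = [queue.Queue() for _ in range(num_agents)]
    predictor = Predictor(request_q, reply_qs)
    predictor.start()
    results = [[] for _ in range(num_agents)]

    def agent(agent_id):
        for step in range(steps):
            state = agent_id * 10 + step          # toy "observation"
            request_q.put((agent_id, state))      # ask for a prediction
            results[agent_id].append(reply_qs[agent_id].get())

    workers = [threading.Thread(target=agent, args=(i,)) for i in range(num_agents)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    request_q.put(None)  # shut the predictor down
    predictor.join()
    return results
```

Calling `run_demo(num_agents=2, steps=3)` runs two toy agents against the shared predictor. Batching pending requests is the point of the design in this sketch: many small per-agent calls become fewer, larger calls to the model, which is the kind of workload a GPU handles well.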
