Learning Invariant Representations for Reinforcement Learning without Reconstruction

Abstract

We study how representation learning can accelerate reinforcement learning from rich observations, such as images, without relying on either domain knowledge or pixel reconstruction. Our goal is to learn representations that both support effective downstream control and are invariant to task-irrelevant details. Bisimulation metrics quantify behavioral similarity between states in continuous MDPs, and we propose using them to learn robust latent representations that encode only the task-relevant information from observations. Our method trains encoders such that distances in latent space equal bisimulation distances in state space. We demonstrate that our method disregards task-irrelevant information on modified visual MuJoCo tasks, where the background is replaced with moving distractors and natural videos, while achieving state-of-the-art performance. We also test on a first-person highway driving task, where our method learns invariance to clouds, weather, and time of day. Finally, we provide generalization results drawn from properties of bisimulation metrics, and links to causal inference.
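The idea of matching latent distances to bisimulation distances can be illustrated with a minimal sketch. The abstract does not state the exact objective, so everything below is an assumption made for illustration: the bisimulation target is taken to be the absolute reward difference plus a discounted 2-Wasserstein distance between predicted next-latent distributions, those distributions are assumed to be diagonal Gaussians, the latent distance is assumed to be the L1 norm, and all function names are hypothetical.

```python
import math

def w2_diag_gaussian(mu_a, sigma_a, mu_b, sigma_b):
    """Closed-form 2-Wasserstein distance between two diagonal Gaussians
    (assumed form of the predicted next-latent distributions)."""
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_a, mu_b))
    std_term = sum((a - b) ** 2 for a, b in zip(sigma_a, sigma_b))
    return math.sqrt(mean_term + std_term)

def bisim_target(r_a, r_b, next_a, next_b, gamma=0.99):
    """Assumed bisimulation target for a pair of states:
    reward difference plus discounted transition-distribution distance.
    next_a and next_b are (mean, std) tuples of the predicted next latent."""
    mu_a, sigma_a = next_a
    mu_b, sigma_b = next_b
    return abs(r_a - r_b) + gamma * w2_diag_gaussian(mu_a, sigma_a, mu_b, sigma_b)

def latent_distance(z_a, z_b):
    """L1 distance between two latent codes (an assumed choice of norm)."""
    return sum(abs(a - b) for a, b in zip(z_a, z_b))

def bisim_loss(z_a, z_b, target):
    """Squared error between the latent distance and the bisimulation target.
    In training, the target would be treated as a constant (no gradient)."""
    return (latent_distance(z_a, z_b) - target) ** 2

# Toy pair of states with hand-picked values.
z_a, z_b = [0.0, 0.0], [1.0, 1.0]
target = bisim_target(
    r_a=1.0, r_b=0.5,
    next_a=([0.0, 0.0], [1.0, 1.0]),
    next_b=([3.0, 4.0], [1.0, 1.0]),
    gamma=0.5,
)
loss = bisim_loss(z_a, z_b, target)
```

Minimizing such a loss over sampled state pairs would pull together the encodings of states with similar rewards and similar predicted dynamics, regardless of how different their pixels look. This is the mechanism by which background distractors could be ignored.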
