Learning Invariant Representations for Reinforcement Learning without Reconstruction

Abstract

We study how representation learning can accelerate reinforcement learning from rich observations, such as images, without relying on either domain knowledge or pixel reconstruction. Our goal is to learn representations that both support effective downstream control and are invariant to task-irrelevant details. Bisimulation metrics quantify behavioral similarity between states in continuous MDPs, and we propose using them to learn robust latent representations that encode only the task-relevant information from observations. Our method trains encoders such that distances in latent space equal bisimulation distances in state space. We demonstrate the effectiveness of our method at disregarding task-irrelevant information using modified visual MuJoCo tasks, where the background is replaced with moving distractors and natural videos, while achieving state-of-the-art performance. We also test a first-person highway driving task where our method learns invariance to clouds, weather, and time of day. Finally, we provide generalization results drawn from properties of bisimulation metrics, and links to causal inference.
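To make the encoder objective concrete, the following is a minimal sketch of one way to train latent distances to match a bisimulation-style target. It is an illustration under stated assumptions, not the implementation used in this work: the L1 latent distance, the diagonal-Gaussian model of next-state predictions, the closed-form 2-Wasserstein term, and the names `bisim_target` and `bisim_loss` are all hypothetical choices made for this example.

```python
import numpy as np


def bisim_target(r_i, r_j, mu_i, sigma_i, mu_j, sigma_j, gamma=0.99):
    """Reward difference plus discounted distance between predicted next states.

    Assumption: next-state predictions are diagonal Gaussians with means mu_*
    and standard deviations sigma_*, for which the 2-Wasserstein distance has
    the closed form sqrt(||mu_i - mu_j||^2 + ||sigma_i - sigma_j||^2).
    """
    w2 = np.sqrt(
        np.sum((mu_i - mu_j) ** 2, axis=-1)
        + np.sum((sigma_i - sigma_j) ** 2, axis=-1)
    )
    return np.abs(r_i - r_j) + gamma * w2


def bisim_loss(z_i, z_j, target):
    """Mean squared gap between latent L1 distances and the target distances."""
    latent_dist = np.sum(np.abs(z_i - z_j), axis=-1)
    return np.mean((latent_dist - target) ** 2)


# Toy batch of one state pair (all values are made up for illustration).
z_i = np.array([[0.0, 0.0]])
z_j = np.array([[1.0, 1.0]])
r_i = np.array([1.0])
r_j = np.array([0.0])
mu_i = np.array([[0.0, 0.0]])
mu_j = np.array([[3.0, 4.0]])
sigma = np.array([[1.0, 1.0]])

target = bisim_target(r_i, r_j, mu_i, sigma, mu_j, sigma, gamma=0.2)
loss = bisim_loss(z_i, z_j, target)
```

In a training loop, the target would typically be treated as a constant (no gradient) while the loss is minimized with respect to the encoder that produces `z_i` and `z_j`; that detail is likewise an assumption of this sketch.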