Concepedia

Publication | Closed Access

Reinforcement learning with perceptual aliasing: the perceptual distinctions approach

Citations: 319

References: 11

Year: 1992

Lonnie Chrisman

Unknown Venue

TLDR

Perceptual aliasing, where indistinguishable percepts require different actions, can severely impair reinforcement learning, as illustrated by a robot that cannot see a battery charger behind it. This work introduces the predictive distinctions approach to compensate for perceptual aliasing in reinforcement learning. The approach augments the control policy with a probabilistic predictive model that learns to track unobservable aspects of the environment from experience. Experiments in a simple simulated domain show that the system autonomously discovers critical distinctions in the world, demonstrating the feasibility of the method.

Abstract

It is known that perceptual aliasing may significantly diminish the effectiveness of reinforcement learning algorithms [Whitehead and Ballard, 1991]. Perceptual aliasing occurs when multiple situations that are indistinguishable from immediate perceptual input require different responses from the system. For example, if a robot can only see forward, yet the presence of a battery charger behind it determines whether or not it should back up, immediate perception alone is insufficient for determining the most appropriate action. This is problematic because reinforcement learning algorithms typically learn a control policy that maps immediate perceptual input to the optimal choice of action. This paper introduces the predictive distinctions approach to compensate for perceptual aliasing caused by incomplete perception of the world. An additional component, a predictive model, is used to track aspects of the world that may not be visible at all times. The model must be learned in addition to the control policy, and to allow for stochastic actions and noisy perception, a probabilistic model is learned from experience. In the process, the system must discover, on its own, the important distinctions in the world. Experimental results are given for a simple simulated domain, and additional issues are discussed.
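To make the idea of tracking unseen aspects of the world concrete, the following is a minimal sketch of a Bayes-filter belief update in the spirit of the robot-and-charger example. It is not the paper's algorithm: the paper learns its probabilistic model from experience, whereas here the model is written by hand. The state names, action names, observation names, and all probabilities are assumptions chosen purely for illustration.

```python
"""Belief tracking under perceptual aliasing (illustrative sketch only)."""

# Hypothetical hidden states. The last two produce the same percept ("wall"),
# so they are aliased for a policy that uses immediate perception alone.
STATES = ("facing_charger", "charger_behind", "no_charger")

# Assumed transition model: TRANSITION[action][state] -> {next_state: prob}.
# Turning around succeeds 90% of the time; staying never changes the state.
TRANSITION = {
    "turn_around": {
        "facing_charger": {"charger_behind": 0.9, "facing_charger": 0.1},
        "charger_behind": {"facing_charger": 0.9, "charger_behind": 0.1},
        "no_charger": {"no_charger": 1.0},
    },
    "stay": {s: {s: 1.0} for s in STATES},
}

# Assumed noisy observation model: OBSERVATION[state] -> {percept: prob}.
OBSERVATION = {
    "facing_charger": {"charger": 0.9, "wall": 0.1},
    "charger_behind": {"charger": 0.1, "wall": 0.9},
    "no_charger": {"charger": 0.1, "wall": 0.9},
}


def update_belief(belief, action, percept):
    """Return the posterior over hidden states after one action and percept.

    Implements b'(s') proportional to O(percept | s') * sum_s T(s' | s, action) * b(s).
    """
    # Prediction step: push the current belief through the transition model.
    predicted = {
        s2: sum(belief[s] * TRANSITION[action][s].get(s2, 0.0) for s in STATES)
        for s2 in STATES
    }
    # Correction step: weight each predicted state by the percept likelihood.
    weighted = {s: predicted[s] * OBSERVATION[s][percept] for s in STATES}
    total = sum(weighted.values())
    if total == 0.0:
        raise ValueError("percept has zero probability under the model")
    return {s: w / total for s, w in weighted.items()}


if __name__ == "__main__":
    # With no history, seeing a wall cannot separate the two aliased states.
    uniform = {s: 1.0 / len(STATES) for s in STATES}
    print(update_belief(uniform, "stay", "wall"))

    # With history (the robot just faced the charger and turned around),
    # the same percept now strongly implies the charger is behind it.
    certain = {"facing_charger": 1.0, "charger_behind": 0.0, "no_charger": 0.0}
    print(update_belief(certain, "turn_around", "wall"))
```

In this sketch, a control policy defined over the belief rather than over the raw percept could choose to back up only when the probability of `charger_behind` is high, which is the kind of distinction immediate perception alone cannot supply.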
