Concepedia

TLDR

The main objective of transfer in reinforcement learning is to reduce the complexity of learning the solution of a target task by effectively reusing the knowledge retained from solving a set of source tasks. The paper introduces a novel algorithm that transfers samples from source to target tasks. The method selects samples from source tasks that are similar to the target task, based on comparable transition models and reward functions, and uses them as input for batch reinforcement‑learning algorithms. Empirical results show that the proposed sample‑transfer approach reduces the number of target‑task samples required and lowers learning complexity, even when some source tasks differ significantly from the target.

Abstract

The main objective of transfer in reinforcement learning is to reduce the complexity of learning the solution of a target task by effectively reusing the knowledge retained from solving a set of source tasks. In this paper, we introduce a novel algorithm that transfers samples (i.e., tuples 〈s, a, s', r〉) from source to target tasks. Under the assumption that tasks have similar transition models and reward functions, we propose a method to select the samples from the source tasks that are most similar to the target task and then use them as input for batch reinforcement-learning algorithms. As a result, the number of samples an agent needs to collect from the target task to learn its solution is reduced. We empirically show that, following the proposed approach, the transfer of samples is effective in reducing the learning complexity, even when some source tasks are significantly different from the target task.
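The selection idea described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not give the similarity measure, so the nearest-neighbour score below, the `bandwidth` parameter, the proportional allocation rule, and every function name (`make_toy_samples`, `task_similarity`, `select_samples`) are assumptions made only for illustration. Each task's samples are stored as a tuple of arrays (s, a, s', r), and a source task scores higher when its samples predict the state changes and rewards seen in the few target samples.

```python
import numpy as np


def make_toy_samples(shift, n, rng):
    """Hypothetical 1-D toy task: the action moves the state by +/-0.1 plus `shift`."""
    s = rng.uniform(-1.0, 1.0, size=(n, 1))
    a = rng.integers(0, 2, size=n)
    s2 = s + (2 * a[:, None] - 1) * 0.1 + shift
    r = -np.abs(s2[:, 0])
    return s, a, s2, r


def task_similarity(source, target, bandwidth=1.0):
    """Assumed heuristic score in [0, 1]: how well source samples explain target samples."""
    s_src, a_src, s2_src, r_src = source
    s_tgt, a_tgt, s2_tgt, r_tgt = target
    scores = []
    for s, a, s2, r in zip(s_tgt, a_tgt, s2_tgt, r_tgt):
        mask = a_src == a
        if not mask.any():
            scores.append(0.0)  # no source sample uses this action
            continue
        # Nearest source sample (same action) to the target state.
        j = np.argmin(np.linalg.norm(s_src[mask] - s, axis=1))
        # Disagreement in state change (transition model) and in reward.
        d_src = s2_src[mask][j] - s_src[mask][j]
        err = np.linalg.norm(d_src - (s2 - s)) + abs(r_src[mask][j] - r)
        scores.append(np.exp(-err / bandwidth))
    return float(np.mean(scores))


def select_samples(sources, target, n_transfer, rng):
    """Draw source samples in proportion to task similarity and append them to the target set."""
    sims = np.array([task_similarity(src, target) for src in sources])
    weights = sims / sims.sum() if sims.sum() > 0 else np.zeros_like(sims)
    sizes = np.array([len(src[1]) for src in sources])
    counts = np.minimum(np.floor(weights * n_transfer).astype(int), sizes)
    parts = [target]
    for src, k in zip(sources, counts):
        idx = rng.choice(len(src[1]), size=k, replace=False)
        parts.append(tuple(x[idx] for x in src))
    # Column-wise concatenation gives one (s, a, s', r) batch.
    batch = tuple(np.concatenate(cols) for cols in zip(*parts))
    return batch, counts


rng = np.random.default_rng(0)
target = make_toy_samples(0.0, 20, rng)    # few target samples
close = make_toy_samples(0.0, 200, rng)    # same dynamics as the target
far = make_toy_samples(1.0, 200, rng)      # shifted dynamics
batch, counts = select_samples([close, far], target, 100, rng)
print("samples transferred per source task:", counts)
```

The resulting `batch` could then be passed to any batch reinforcement-learning method, such as fitted Q-iteration. In this toy setup the source task whose dynamics match the target contributes more samples than the shifted one, which mirrors the abstract's claim that dissimilar source tasks need not harm the transfer.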
