Publication | Open Access
Learning to Learn without Forgetting by Maximizing Transfer and Minimizing Interference
Citations: 101
References: 38
Year: 2018
Artificial Intelligence, Incremental Learning, Engineering, Machine Learning, Sequential Learning, Education, Lifelong Reinforcement Learning, Data Science, Experience Replay, Memory, Robot Learning, Continual Learning (Lifelong Deep Learning), Learning Problem, Cognitive Science, Autonomous Learning, Computer Science, Minimizing Interference, Deep Learning, Gradient Alignment, Continual Learning, Learning Theory, Transfer Learning
Poor performance in continual learning over non‑stationary data distributions remains a major challenge for scaling neural networks to realistic settings. The authors recast continual learning as a trade‑off between transfer and interference, and introduce Meta‑Experience Replay (MER), which optimizes this trade‑off by encouraging gradient alignment across examples. MER combines experience replay with meta‑learning to learn parameters that make interference from future gradients less likely and transfer from future gradients more likely. In experiments on lifelong supervised learning benchmarks and non‑stationary reinforcement learning environments, MER consistently outperforms the baselines, and the gap widens as the environment becomes more non‑stationary and as the fraction of experiences stored in replay memory shrinks.
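To make the notion of gradient alignment concrete, here is a minimal sketch. It assumes that alignment between two examples is measured by the dot product of their per-example loss gradients, with a positive value read as transfer and a negative value as interference. That reading, the squared-error loss, the linear model, and the helper names `squared_error_grad` and `alignment` are illustrative assumptions, not details given in this record.

```python
import numpy as np

def squared_error_grad(w, x, y):
    """Gradient of 0.5 * (w.x - y)^2 with respect to w (assumed toy loss)."""
    return (w @ x - y) * x

def alignment(w, example_a, example_b):
    """Dot product of two per-example gradients.

    Assumed reading: > 0 means a step on one example also helps the
    other (transfer); < 0 means it hurts the other (interference).
    """
    g_a = squared_error_grad(w, *example_a)
    g_b = squared_error_grad(w, *example_b)
    return float(g_a @ g_b)

w = np.zeros(2)
a = (np.array([1.0, 0.0]), 1.0)
b = (np.array([1.0, 0.0]), 2.0)   # same input, error of the same sign as a
c = (np.array([1.0, 0.0]), -1.0)  # same input, error of the opposite sign

transfer_score = alignment(w, a, b)      # gradients point the same way
interference_score = alignment(w, a, c)  # gradients point in opposite directions
```

With these toy values, `transfer_score` is positive and `interference_score` is negative, so a gradient step on `a` would lower the loss on `b` but raise it on `c`.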
Lack of performance when it comes to continual learning over non-stationary distributions of data remains a major challenge in scaling neural network learning to more human-realistic settings. In this work we propose a new conceptualization of the continual learning problem in terms of a temporally symmetric trade-off between transfer and interference that can be optimized by enforcing gradient alignment across examples. We then propose a new algorithm, Meta-Experience Replay (MER), that directly exploits this view by combining experience replay with optimization-based meta-learning. This method learns parameters that make interference based on future gradients less likely and transfer based on future gradients more likely. We conduct experiments across continual lifelong supervised learning benchmarks and non-stationary reinforcement learning environments, demonstrating that our approach consistently outperforms recently proposed baselines for continual learning. Our experiments show that the gap between the performance of MER and baseline algorithms grows both as the environment gets more non-stationary and as the fraction of the total experiences stored gets smaller.
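The combination of experience replay with optimization-based meta-learning described in the abstract can be sketched roughly as follows. This is not the authors' implementation. It assumes a reservoir-sampled replay memory, a Reptile-style interpolation as the meta-learning step, and a linear model with squared-error loss. The names `ReservoirBuffer` and `replay_meta_step` and the hyperparameters `lr`, `meta_lr`, and `k` are all hypothetical.

```python
import random
import numpy as np

class ReservoirBuffer:
    """Fixed-size memory (assumed design): reservoir sampling gives every
    example seen so far the same chance of being stored."""
    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, item):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(item)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = item  # overwrite a random slot

    def sample(self, k):
        return self.rng.sample(self.items, min(k, len(self.items)))

def grad(w, x, y):
    """Gradient of 0.5 * (w.x - y)^2 with respect to w (assumed toy loss)."""
    return (w @ x - y) * x

def replay_meta_step(w, current, buffer, lr=0.1, meta_lr=0.5, k=4):
    """One hypothetical update: SGD over replayed and current examples,
    one example at a time, then interpolate back toward the start
    (a Reptile-style step, assumed here as the meta-learning rule)."""
    batch = buffer.sample(k) + [current]
    w_fast = w.copy()
    for x, y in batch:
        w_fast -= lr * grad(w_fast, x, y)
    buffer.add(current)
    return w + meta_lr * (w_fast - w)

# Toy non-stationary stream: the target weights switch halfway through.
stream_rng = np.random.default_rng(0)
buffer = ReservoirBuffer(capacity=5)
w = np.zeros(2)
for t in range(40):
    x = stream_rng.normal(size=2)
    target = np.array([1.0, -1.0]) if t < 20 else np.array([-1.0, 1.0])
    w = replay_meta_step(w, (x, float(target @ x)), buffer)
```

Taking the inner SGD steps one example at a time and then interpolating is what lets a Reptile-style update implicitly favor parameters whose per-example gradients agree. Whether this matches the paper's exact procedure should be checked against the full text.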