Publication | Open Access
A continual learning survey: Defying forgetting in classification tasks
Citations: 1.5K
References: 93
Year: 2021
Artificial Intelligence, Incremental Learning, Engineering, Machine Learning, Sequential Learning, Education, Data Science, Memory, Multi-task Learning, Continual Learning Survey, Robot Learning, Continual Learning (Lifelong Deep Learning), Retrieval Technique, Cognitive Science, Tiny Imagenet, Computer Science, Deep Learning, Knowledge Distillation, Artificial Neural Networks, Dropout Regularization, Continual Learning (Educational Psychology)
Artificial neural networks excel at fixed classification tasks but suffer catastrophic forgetting when their knowledge is extended beyond the original task; continual learning aims to let networks accumulate knowledge across sequential tasks without retraining from scratch. This survey covers continual learning for task-incremental classification, presenting a taxonomy, a framework for determining the stability-plasticity trade-off, and a comprehensive experimental comparison of 11 methods and baselines. The authors evaluate on Tiny ImageNet, iNaturalist, and a sequence of recognition datasets, studying the influence of model capacity, weight decay, dropout, and task order, and qualitatively comparing the methods in terms of required memory, computation time, and storage.
Artificial neural networks thrive in solving the classification problem for a particular rigid task, acquiring knowledge through generalized learning behaviour from a distinct training phase. The resulting network resembles a static entity of knowledge, and endeavours to extend this knowledge without targeting the original task result in catastrophic forgetting. Continual learning shifts this paradigm towards networks that can continually accumulate knowledge over different tasks without the need to retrain from scratch. We focus on task incremental classification, where tasks arrive sequentially and are delineated by clear boundaries. Our main contributions concern: (1) a taxonomy and extensive overview of the state-of-the-art; (2) a novel framework to continually determine the stability-plasticity trade-off of the continual learner; (3) a comprehensive experimental comparison of 11 state-of-the-art continual learning methods; and (4) baselines. We empirically scrutinize method strengths and weaknesses on three benchmarks: Tiny ImageNet, the large-scale unbalanced iNaturalist, and a sequence of recognition datasets. We study the influence of model capacity, weight decay and dropout regularization, and the order in which the tasks are presented, and qualitatively compare methods in terms of required memory, computation time, and storage.
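As a rough illustration of how catastrophic forgetting can be quantified in the task incremental setting described above, the sketch below computes two summary numbers from a table of per-task accuracies. The abstract does not define these metrics, so the definitions here are an assumption: forgetting is taken as the drop in a task's accuracy between the moment it was just learned and the end of the sequence. The function names and the toy accuracy values are hypothetical and are not results from the paper.

```python
from typing import List, Optional

# acc[i][j] = accuracy on task j, measured after training on tasks 0..i.
# Entries with j > i are None because task j has not been seen yet.
AccTable = List[List[Optional[float]]]


def average_final_accuracy(acc: AccTable) -> float:
    """Mean accuracy over all tasks after the last task has been learned."""
    final_row = acc[-1]
    return sum(final_row) / len(final_row)


def average_forgetting(acc: AccTable) -> float:
    """Mean drop in accuracy from 'just learned' to 'end of sequence'.

    Assumed definition: for task j, forgetting = acc[j][j] - acc[last][j].
    The final task is skipped because nothing is trained after it.
    """
    n = len(acc)
    if n < 2:
        return 0.0
    drops = [acc[j][j] - acc[n - 1][j] for j in range(n - 1)]
    return sum(drops) / len(drops)


# Toy, made-up accuracies for a sequence of three tasks (illustrative only).
toy_acc: AccTable = [
    [0.80, None, None],
    [0.70, 0.75, None],
    [0.60, 0.70, 0.90],
]

if __name__ == "__main__":
    print(f"average final accuracy: {average_final_accuracy(toy_acc):.3f}")
    print(f"average forgetting:     {average_forgetting(toy_acc):.3f}")
```

A low forgetting value together with a low accuracy on the most recent tasks would suggest a learner that leans towards stability, while the reverse would suggest one that leans towards plasticity; reporting both numbers keeps that trade-off visible.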