Class-incremental Learning via Deep Model Consolidation

TLDR

Deep neural networks suffer catastrophic forgetting during incremental learning, and existing approaches are biased toward either old or new classes unless exemplars of old data are used. The authors propose Deep Model Consolidation (DMC) to enable class-incremental learning without access to the original training data. DMC trains a separate model for the new classes, then fuses the old and new models through a novel double-distillation objective using publicly available unlabeled auxiliary data. DMC achieves markedly better performance than state-of-the-art methods on CIFAR-100, CUB-200, and PASCAL VOC 2007 in the single-headed IL setting.

Abstract

Deep neural networks (DNNs) often suffer from "catastrophic forgetting" during incremental learning (IL): an abrupt degradation of performance on the original set of classes when the training objective is adapted to a newly added set of classes. Existing IL approaches tend to produce a model that is biased towards either the old classes or the new classes unless exemplars of the old data are available. To address this issue, we propose a class-incremental learning paradigm called Deep Model Consolidation (DMC), which works well even when the original training data is not available. The idea is to first train a separate model only for the new classes, and then combine the two individual models, trained on data from two distinct sets of classes (old classes and new classes), via a novel double distillation training objective. The two existing models are consolidated by exploiting publicly available unlabeled auxiliary data, which overcomes the potential difficulties caused by the unavailability of the original training data. Compared to the state-of-the-art techniques, DMC demonstrates significantly better performance in image classification (CIFAR-100 and CUB-200) and object detection (PASCAL VOC 2007) in the single-headed IL setting.
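
The abstract does not spell out the form of the double distillation objective, so the following is only a minimal sketch under stated assumptions: the consolidated (student) model is assumed to regress, with a squared-error loss, onto the concatenation of the mean-centered logits that the old-class and new-class models produce for one unlabeled auxiliary sample. The function names `center`, `consolidation_target`, and `double_distillation_loss` are hypothetical and chosen for illustration.

```python
from statistics import fmean


def center(logits):
    """Subtract the mean so both teachers' logits share a common origin."""
    mean = fmean(logits)
    return [z - mean for z in logits]


def consolidation_target(old_logits, new_logits):
    """Concatenate mean-centered old-class and new-class teacher logits.

    Assumption: this is one plausible regression target, not necessarily
    the exact one used by the authors.
    """
    return center(old_logits) + center(new_logits)


def double_distillation_loss(student_logits, old_logits, new_logits):
    """Mean squared error between student logits and the combined target."""
    target = consolidation_target(old_logits, new_logits)
    if len(student_logits) != len(target):
        raise ValueError("student needs one logit per old and new class")
    return fmean([(s - t) ** 2 for s, t in zip(student_logits, target)])


# One unlabeled auxiliary sample scored by both teachers (made-up numbers).
old_logits = [2.0, 0.0]        # old model: 2 old classes
new_logits = [1.0, 3.0, 2.0]   # new model: 3 new classes
student = [0.0] * 5            # consolidated model: 5 classes in total
loss = double_distillation_loss(student, old_logits, new_logits)
```

In practice the loss would be averaged over a batch of auxiliary samples and minimized with respect to the student's parameters. Only the two trained models and unlabeled data are needed, which is what lets the consolidation proceed without the original training data.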
