Optimization as a Model for Few-Shot Learning

TLDR

Deep neural networks excel with abundant data but struggle on few-shot tasks, in part because gradient-based optimization of high-capacity models is generally believed to need many iterative steps over many examples. The authors propose an LSTM-based meta-learner that learns the exact optimization algorithm used to train a separate learner network in the few-shot regime. Its parameterization lets it learn parameter updates suited to a fixed number of update steps, along with a general initialization of the learner that allows quick convergence. The meta-learner is competitive with deep metric-learning methods on few-shot learning.

Abstract

Though deep neural networks have shown great success in the large data domain, they generally perform poorly on few-shot learning tasks, where a model has to quickly generalize after seeing very few examples from each class. The general belief is that gradient-based optimization in high capacity models requires many iterative steps over many examples to perform well. Here, we propose an LSTM-based meta-learner model to learn the exact optimization algorithm used to train another learner neural network in the few-shot regime. The parametrization of our model allows it to learn appropriate parameter updates specifically for the scenario where a set amount of updates will be made, while also learning a general initialization of the learner network that allows for quick convergence of training. We demonstrate that this meta-learning model is competitive with deep metric-learning techniques for few-shot learning.
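
The abstract does not spell out the update rule, so the following is only a minimal sketch of the idea, assuming the update has the shape of an LSTM cell-state update in which the learner's parameters play the role of the cell state and the negative gradient plays the role of the candidate value. The names `gated_update`, `run_learner`, and `quadratic_grad`, the fixed scalar gate logits, and the toy quadratic loss are all illustrative assumptions; in a trained meta-learner the gates would be produced by the LSTM, and the starting point `theta0` would itself be learned.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_update(theta, grad, forget_logit, input_logit):
    """One cell-state-style step: theta_new = f * theta + i * (-grad).

    Assumption: the gates are fixed scalars here. A learned meta-learner
    would compute them from quantities such as the gradient and the loss.
    """
    f = sigmoid(forget_logit)  # how much of the old parameters to keep
    i = sigmoid(input_logit)   # acts like a learning rate
    return [f * t - i * g for t, g in zip(theta, grad)]


def quadratic_grad(theta):
    # Gradient of the toy loss 0.5 * sum(t ** 2); hypothetical stand-in
    # for the learner's loss on a few-shot training batch.
    return list(theta)


def run_learner(theta0, steps, forget_logit, input_logit):
    """Apply a fixed number of gated updates starting from theta0."""
    theta = list(theta0)
    for _ in range(steps):
        grad = quadratic_grad(theta)
        theta = gated_update(theta, grad, forget_logit, input_logit)
    return theta


if __name__ == "__main__":
    # With f close to 1 and i = 0.1, each step is approximately plain
    # gradient descent with learning rate 0.1.
    theta = run_learner([1.0, -2.0], steps=5,
                        forget_logit=20.0,
                        input_logit=math.log(0.1 / 0.9))
    print(theta)
```

With the forget gate saturated near 1 and the input gate held at a constant value, this reduces to ordinary gradient descent, which suggests why letting a meta-learner choose the gates and the initialization could yield update rules tailored to a small, fixed step budget.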
