Concepedia

Concept: pre-trained models

- 1.8K Publications
- 148.8K Citations
- 7.1K Authors
- 1.1K Institutions
About

A pre-trained model is a machine learning model first trained on a large, generic dataset, typically in an unsupervised or self-supervised manner, to acquire general-purpose representations. The knowledge gained in this initial phase is then transferred to related, often more specialized, tasks: rather than training from scratch, practitioners fine-tune the pre-trained model on task-specific data. Because the pre-trained model already provides strong feature extraction, fine-tuning requires substantially less data and computation than full training, and pre-trained models have achieved state-of-the-art results across diverse applications.
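The pre-train-then-fine-tune workflow described above can be sketched in a toy example. This is an illustrative sketch, not any real library's API: the "backbone" here is a single linear layer fit by least squares on a large synthetic dataset, then frozen, and only a small linear head is fit on 50 labelled examples of a related task. All names (`features`, `head`, the dataset sizes) are hypothetical choices for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- "Pre-training" phase: learn a feature extractor on a large generic dataset.
# Real pre-trained models (e.g. language or vision backbones) learn far richer
# representations; a least-squares linear map stands in for that here.
X_big = rng.normal(size=(10_000, 20))            # large generic dataset
W_hidden = rng.normal(size=(20, 5))              # unknown structure in the data
Y_big = np.tanh(X_big @ W_hidden)                # generic multi-output targets
W_pre, *_ = np.linalg.lstsq(X_big, Y_big, rcond=None)  # learned "backbone"

def features(X):
    """Frozen pre-trained feature extractor."""
    return np.tanh(X @ W_pre)

# --- Fine-tuning phase: only a small head is trained on the specialized task,
# so a few dozen labelled examples suffice.
X_small = rng.normal(size=(50, 20))              # scarce task-specific data
y_small = (X_small @ W_hidden[:, 0] > 0).astype(float)
F = features(X_small)
head, *_ = np.linalg.lstsq(F, y_small, rcond=None)

# Evaluate on the same small set (a sketch; real workflows use held-out data).
preds = (F @ head > 0.5).astype(float)
accuracy = (preds == y_small).mean()
print(f"fine-tuned head accuracy on small task: {accuracy:.2f}")
```

The point of the sketch is the division of labour: the expensive fit happens once on abundant generic data, while the task-specific step touches only a 5-parameter head and 50 examples.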

Top Authors

Rankings shown are based on concept H-Index.

1. ZL — Tsinghua University
2. JG — Microsoft (United States)
3. ND — Microsoft Research Asia (China)
4. MS — Tsinghua University
5. XJ — Huawei Technologies (Sweden)

Top Institutions

Rankings shown are based on concept H-Index.

1. Google (United States) — Mountain View, United States
2. Cambridge, United Kingdom
3. Tsinghua University — Beijing, China
4. Pittsburgh, United States
5. Peking University — Beijing, China