Publication | Open Access
Improving neural networks by preventing co-adaptation of feature detectors
Citations: 6.6K
References: 16
Year: 2012
Feature Detector, Large AI Model, Convolutional Neural Network, Machine Vision, Machine Learning, Data Science, Engineering, Pattern Recognition, Feature Learning, Machine Learning Model, Sparse Neural Network, Small Training, Speech Processing, Computer Science, Deep Learning, Neural Architecture Search, Feature Detectors, Speech Recognition
Large feedforward neural networks trained on small datasets often overfit and perform poorly on unseen data. Dropout randomly omits half of the feature detectors during training, preventing co‑adaptations and forcing each neuron to learn features useful across many internal contexts. Dropout dramatically reduces overfitting, yielding large performance gains and new state‑of‑the‑art results on speech and object recognition benchmarks.
When a large feedforward neural network is trained on a small training set, it typically performs poorly on held-out test data. This "overfitting" is greatly reduced by randomly omitting half of the feature detectors on each training case. This prevents complex co-adaptations in which a feature detector is only helpful in the context of several other specific feature detectors. Instead, each neuron learns to detect a feature that is generally helpful for producing the correct answer given the combinatorially large variety of internal contexts in which it must operate. Random "dropout" gives big improvements on many benchmark tasks and sets new records for speech and object recognition.
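The random omission described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `dropout_train` and `dropout_test` and the parameter `p_drop` are hypothetical, and the test-time rule (keep every unit but scale by the keep probability, which matches halving the outgoing weights when half the units are dropped) is an assumption here, since the abstract itself does not describe it.

```python
import random


def dropout_train(activations, p_drop=0.5, rng=random):
    """Zero each hidden activation independently with probability p_drop.

    A fresh random mask is drawn on every call, so each training case
    sees a different subset of feature detectors.
    """
    return [0.0 if rng.random() < p_drop else a for a in activations]


def dropout_test(activations, p_drop=0.5):
    """Keep every unit at test time, scaled by the keep probability.

    Assumption: this stands in for halving the outgoing weights when
    p_drop is 0.5, so expected inputs to the next layer match training.
    """
    keep = 1.0 - p_drop
    return [a * keep for a in activations]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    hidden = [0.8, 1.5, 0.3, 2.0, 1.1, 0.6]  # made-up hidden-layer outputs
    print("train:", dropout_train(hidden, rng=rng))
    print("test: ", dropout_test(hidden))
```

Because a unit cannot count on any particular neighbour being present for a given training case, it is pushed toward features that are useful on their own, which is the co-adaptation argument made above.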