Publication | Closed Access
Multi-task learning in deep neural networks for improved phoneme recognition
Citations: 242
References: 11
Year: 2013
Venue: Unknown
Engineering, Machine Learning, Speech Recognition, Data Science, Phonetics, Robust Speech Recognition, Multi-task Learning, Error Rate, Voice Recognition, Language Studies, Phone Context, Computer Science, Deep Learning, Deep Neural Network, Speech Communication, Deep Neural Networks, Multi-speaker Speech Recognition, Speech Processing, Speech Input, Speech Perception, Linguistics
In this paper we demonstrate how to improve the performance of deep neural network (DNN) acoustic models using multi-task learning. In multi-task learning, the network is trained to perform both the primary classification task and one or more secondary tasks using a shared representation. The additional model parameters associated with the secondary tasks represent a very small increase in the number of trained parameters and can be discarded at runtime. We explore three natural choices for the secondary task: the phone label, the phone context, and the state context. We demonstrate that, even on a strong baseline, multi-task learning can provide a significant decrease in error rate. Using phone context as the secondary task, the phonetic error rate (PER) on the TIMIT core test set is reduced from 21.63% to 20.25%, surpassing the best performance in the literature for a DNN that uses a standard feed-forward network architecture.
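The shared-representation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the layer sizes, the single tanh hidden layer, the class name `MultiTaskNet`, the `secondary_weight` parameter, and the choice to combine the tasks as a weighted sum of cross-entropies are all assumptions made for illustration, and no training step is shown. The sketch only demonstrates that both output heads read the same hidden activations and that the secondary head can be dropped at runtime without changing the primary output.

```python
import math
import random


def linear(x, weights, bias):
    """Affine map: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    peak = max(z)
    exps = [math.exp(v - peak) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (illustrative initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class MultiTaskNet:
    """Hypothetical toy network: one shared hidden layer, two softmax heads."""

    def __init__(self, n_in, n_hidden, n_primary, n_secondary, seed=0):
        rng = random.Random(seed)
        self.shared = init_layer(n_hidden, n_in, rng)
        self.primary_head = init_layer(n_primary, n_hidden, rng)
        self.secondary_head = init_layer(n_secondary, n_hidden, rng)

    def hidden(self, x):
        # Shared representation used by both tasks (tanh is an assumption).
        return [math.tanh(v) for v in linear(x, *self.shared)]

    def predict_primary(self, x):
        # Runtime path: only the shared layer and the primary head are used.
        return softmax(linear(self.hidden(x), *self.primary_head))

    def multitask_loss(self, x, primary_label, secondary_label,
                       secondary_weight=1.0):
        # Training-time objective (assumed form): weighted sum of the
        # cross-entropies of the primary and secondary tasks.
        h = self.hidden(x)
        p = softmax(linear(h, *self.primary_head))
        s = softmax(linear(h, *self.secondary_head))
        return (-math.log(p[primary_label])
                - secondary_weight * math.log(s[secondary_label]))


net = MultiTaskNet(n_in=4, n_hidden=8, n_primary=5, n_secondary=3)
frame = [0.2, -0.1, 0.4, 0.0]          # made-up acoustic feature vector
loss = net.multitask_loss(frame, primary_label=2, secondary_label=1)
posteriors = net.predict_primary(frame)

# Discarding the secondary head leaves the primary prediction untouched.
net.secondary_head = None
posteriors_after = net.predict_primary(frame)
```

Because the gradient of the combined loss would flow through `hidden` from both heads during training, the secondary task shapes the shared layer even though its own parameters are unused at recognition time.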