Publication | Closed Access
Simplifying long short-term memory acoustic models for fast training and decoding
Citations: 94
References: 18
Year: 2016
Unknown Venue
Engineering, Machine Learning, Spoken Language Processing, Recurrent Neural Network, Acoustic Modeling, Speech Recognition, Sparse Neural Network, Robust Speech Recognition, Health Sciences, Large AI Model, LSTM Simplifications, Computer Engineering, LSTM Models, Computer Science, Fast Training, Deep Learning, Neural Architecture Search, Model Compression, Speech Communication, Speech Processing, Speech Perception, Linguistics
In acoustic modeling, recurrent neural networks (RNNs) using Long Short-Term Memory (LSTM) units have recently been shown to outperform deep neural network (DNN) models. This paper focuses on resolving two challenges faced by LSTM models: high model complexity and poor decoding efficiency. Motivated by our analysis of the gates' activations and functions, we present two LSTM simplifications: deriving input gates from forget gates, and removing recurrent inputs from output gates. To accelerate LSTM decoding, we propose applying frame skipping during training, and frame skipping and posterior copying (FSPC) during decoding. In the experiments, the model simplifications reduce the size of LSTM models by 26%, resulting in a simpler model structure. Meanwhile, applying FSPC speeds up model computation during LSTM decoding by a factor of 2. All these improvements are achieved at the cost of a 1% degradation in word error rate (WER).
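To make the FSPC idea concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the names `decode_with_fspc`, `CountingModel`, and `skip` are hypothetical, the toy model is stateless, and a skip factor of 2 is chosen only because the abstract reports a 2x speedup. The acoustic model is evaluated on every `skip`-th frame, and its posterior is copied to the skipped frames that follow.

```python
from typing import Callable, List, Sequence

Frame = Sequence[float]
Posterior = List[float]


def decode_with_fspc(
    frames: Sequence[Frame],
    model: Callable[[Frame], Posterior],
    skip: int = 2,
) -> List[Posterior]:
    """Frame skipping and posterior copying (illustrative sketch).

    The model is called only on frames 0, skip, 2*skip, ...; every
    skipped frame reuses (copies) the most recent computed posterior.
    """
    if skip < 1:
        raise ValueError("skip must be >= 1")
    posteriors: List[Posterior] = []
    last: Posterior = []
    for t, frame in enumerate(frames):
        if t % skip == 0:
            last = model(frame)        # frame is actually evaluated
        posteriors.append(list(last))  # skipped frames get a copy
    return posteriors


class CountingModel:
    """Hypothetical stand-in for an acoustic model; counts evaluations."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, frame: Frame) -> Posterior:
        self.calls += 1
        score = sum(frame)
        return [score, 1.0 - score]  # toy two-class "posterior"


if __name__ == "__main__":
    toy = CountingModel()
    feats = [[0.1], [0.2], [0.3], [0.4], [0.5]]
    out = decode_with_fspc(feats, toy, skip=2)
    print(len(out), "posteriors from", toy.calls, "model evaluations")
```

With `skip=2` the model runs on roughly half of the frames, which is where the decoding speedup would come from in this sketch. In a real LSTM the model carries recurrent state, so it would also need to be trained on skipped frames, as the abstract notes.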