Publication | Closed Access
Towards Better Decoding and Language Model Integration in Sequence to Sequence Models
Citations: 327
References: 26
Year: 2017
Unknown Venue
Keywords: Engineering, Machine Learning, Language Model Integration, Spoken Language Processing, Large Language Model, Corpus Linguistics, Speech Recognition, Natural Language Processing, Syntax, Data Science, Computational Linguistics, Grammar, Language Studies, Language Models, Real-time Language, Machine Translation, Sequence Modelling, Towards Better Decoding, Linguistics, Deep Learning, Sequence Models, Trigram Language Model, Speech Communication, Neural Machine Translation, Speech Processing, Speech Input, Framework Advocates, Speech Translation, Language Generation
The recently proposed Sequence-to-Sequence (seq2seq) framework advocates replacing complex data processing pipelines, such as an entire automatic speech recognition system, with a single neural network trained in an end-to-end fashion. In this contribution, we analyse an attention-based seq2seq speech recognition system that directly transcribes recordings into characters. We observe two shortcomings: overconfidence in its predictions and a tendency to produce incomplete transcriptions when language models are used. We propose practical solutions to both problems, achieving competitive speaker-independent word error rates (WER) on the Wall Street Journal dataset: without separate language models we reach 10.6% WER, while together with a trigram language model we reach 6.7% WER.
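The abstract reports results as word error rate (WER). As a rough illustration of how that metric is commonly computed, the sketch below uses the standard definition: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenisation are assumptions made for this example and are not taken from the paper, whose scoring setup may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative sketch only: tokenises on whitespace and applies no
    text normalisation, which real evaluation tools usually do.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# A truncated hypothesis is penalised through deletions.
full = "the cat sat on the mat"
truncated = "the cat sat"
print(word_error_rate(full, full))
print(word_error_rate(full, truncated))
```

Under this definition, an incomplete transcription of the kind the abstract describes counts every dropped reference word as a deletion error, so truncated outputs raise WER directly.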