Publication | Closed Access
Librispeech: An ASR corpus based on public domain audio books
Citations: 5.7K
References: 22
Year: 2015
Venue: Unknown
Engineering, Speech Corpus, Spoken Language Processing, Corpus Linguistics, Speech Recognition, Natural Language Processing, Language Documentation, Data Science, Computational Linguistics, Phonetics, Robust Speech Recognition, Speech Interface, Voice Recognition, Language Studies, Machine Translation, Audio Retrieval, Read English Speech, Speech Communication, New Corpus, Audio Mining, LibriVox Project, Speech Processing, Speech Input, Speech Perception, Linguistics, ASR Corpus
This paper introduces a new corpus of read English speech for training and evaluating speech recognition systems. The LibriSpeech corpus, comprising 1000 hours of 16‑kHz read English speech from LibriVox audiobooks, is released with Kaldi scripts for easy system building. The corpus and accompanying language‑model data are freely available, and acoustic models trained on LibriSpeech achieve lower error rates on WSJ test sets than models trained on WSJ alone.
This paper introduces a new corpus of read English speech, suitable for training and evaluating speech recognition systems. The LibriSpeech corpus is derived from audiobooks that are part of the LibriVox project, and contains 1000 hours of speech sampled at 16 kHz. We have made the corpus freely available for download, along with separately prepared language-model training data and pre-built language models. We show that acoustic models trained on LibriSpeech give lower error rates on the Wall Street Journal (WSJ) test sets than models trained on WSJ itself. We are also releasing Kaldi scripts that make it easy to build these systems.