Publication | Closed Access
IRSTLM: an open source toolkit for handling large scale language models
Citations: 325
References: 11
Year: 2008
Venue: Unknown
LLM Fine-tuning, Engineering, Machine Learning, Speech Corpus, Semantics, Large Language Model, Corpus Linguistics, Text Mining, Speech Recognition, Large Language Models, Natural Language Processing, Open Source Toolkit, Data Science, Computational Linguistics, Language Engineering, Language Studies, Machine Translation, Linguistics, Computer Science, Neural Machine Translation, Language Model Compression, Speech Processing, Speech Translation
Research in speech recognition and machine translation is boosting the use of large-scale n-gram language models. We present an open-source toolkit that efficiently handles language models with billions of n-grams on conventional machines. The IRSTLM toolkit supports distribution of n-gram collection and smoothing over a computer cluster, language model compression through probability quantization, and lazy loading of huge language models from disk. IRSTLM has so far been successfully deployed with the Moses toolkit for statistical machine translation and with the FBK-irst speech recognition system. Efficiency of the tool is reported on a speech transcription task of Italian political speeches using a language model of 1.1 billion four-grams.
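To make "compression through probability quantization" concrete, here is a minimal Python sketch of one common approach: log-probabilities are sorted, split into equally populated bins, and each value is stored as a small integer code pointing at its bin's mean. The abstract does not say which quantization scheme IRSTLM uses, so the binning strategy, the 256-level default, and the function names `build_codebook`, `encode`, and `decode` are all assumptions made for illustration, not the toolkit's API.

```python
import bisect


def build_codebook(values, levels=256):
    """Equal-population binning (an assumed scheme, not necessarily IRSTLM's).

    Sorts the values, splits them into `levels` bins of near-equal size,
    and returns the mean of each bin as a sorted list of centers.
    """
    if not values:
        raise ValueError("need at least one value to build a codebook")
    ordered = sorted(values)
    n = len(ordered)
    levels = min(levels, n)  # never create more bins than values
    centers = []
    for b in range(levels):
        chunk = ordered[b * n // levels:(b + 1) * n // levels]
        centers.append(sum(chunk) / len(chunk))
    return centers


def encode(value, centers):
    """Return the index of the center nearest to `value`."""
    i = bisect.bisect_left(centers, value)
    if i == 0:
        return 0
    if i == len(centers):
        return len(centers) - 1
    # Pick whichever neighbouring center is closer.
    return i if centers[i] - value < value - centers[i - 1] else i - 1


def decode(code, centers):
    """Map a stored code back to its approximate log-probability."""
    return centers[code]


# Hypothetical log10 probabilities, quantized to 4 levels (2 bits each).
log_probs = [-0.1, -0.2, -1.0, -1.1, -3.0, -3.2, -5.0, -5.5]
centers = build_codebook(log_probs, levels=4)
codes = [encode(p, centers) for p in log_probs]
restored = [decode(c, centers) for c in codes]
```

With 256 levels, each probability fits in one byte instead of a 4-byte float, at the cost of a small approximation error bounded by the width of the bin it falls in.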