Publication | Open Access
An Analysis of Neural Language Modeling at Multiple Scales
Year: 2018 | Citations: 142 | References: 20
Engineering, Neurolinguistics, Psycholinguistics, Multilingual Pretraining, Large Language Model, Corpus Linguistics, Text Mining, Single Modern GPU, Speech Recognition, Natural Language Processing, Syntax, Specialized Architectures, Computational Linguistics, Penn Treebank, Language Engineering, Language Studies, Language Models, Neural Scaling Law, Machine Translation, Large AI Model, NLP Task, Language Network, Computer Science, Multiple Scales, Linguistics
Many of the leading approaches in language modeling introduce novel, complex, and specialized architectures. We take existing state-of-the-art word-level language models based on LSTMs and QRNNs and extend them both to larger vocabularies and to character-level granularity. When properly tuned, LSTMs and QRNNs achieve state-of-the-art results on character-level (Penn Treebank, enwik8) and word-level (WikiText-103) datasets, respectively. Results are obtained in only 12 hours (WikiText-103) to 2 days (enwik8) using a single modern GPU.