Publication | Closed Access
Just-in-time language modelling
Citations: 52
References: 5
Year: 2002
Venue: Unknown
Engineering, Semantics, Large Language Model, Corpus Linguistics, Text Mining, Natural Language Processing, Syntax, Just-in-time Language Modelling, Information Retrieval, Data Science, Computational Linguistics, Language Engineering, Language Studies, Language Models, Machine Translation, Marginal Probabilities, Language Modelling, NLP Task, Language Technology, Distributional Semantics, Linguistics
Traditional approaches to language modelling have relied on a fixed corpus of text to inform the parameters of a probability distribution over word sequences. Increasing the corpus size often leads to better-performing language models, but no matter how large, the corpus is a static entity, unable to reflect information about events which postdate it. We introduce an online paradigm which interleaves the estimation and application of a language model. We present a Bayesian approach to online language modelling, in which the marginal probabilities of a static trigram model are dynamically updated to match the topic being dictated to the system. We also describe the architecture of a prototype we have implemented which uses the World Wide Web (WWW) as a source of information, and provide results from some initial proof-of-concept experiments.
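The abstract does not give the update rule, so the following is only a minimal sketch of one common way to adapt a static model's marginals to a topic, not the authors' method. It assumes a Dirichlet prior centred on the static unigram marginal, takes the posterior mean given word counts from topic text (for example, retrieved web pages), and rescales the static trigram conditionals by the ratio of adapted to static marginals. The function name `adapt_conditional` and the parameters `prior_strength` and `beta` are hypothetical.

```python
def adapt_conditional(static_cond, static_unigram, topic_counts,
                      prior_strength=10.0, beta=0.5):
    """Rescale P_static(w | history) towards a topic (illustrative sketch).

    static_cond    : dict word -> P_static(word | history) for one fixed history
    static_unigram : dict word -> P_static(word), the static marginal
    topic_counts   : dict word -> count observed in topic text
    prior_strength : assumed pseudo-count weight given to the static marginal
    beta           : assumed exponent damping the marginal ratio
    """
    total = sum(topic_counts.get(w, 0) for w in static_unigram)
    scaled = {}
    for word, prob in static_cond.items():
        # Posterior mean of the topic marginal under a Dirichlet prior
        # whose pseudo-counts are prior_strength * P_static(word).
        topic_marginal = (
            topic_counts.get(word, 0) + prior_strength * static_unigram[word]
        ) / (total + prior_strength)
        # Scale the static conditional by the damped marginal ratio.
        scaled[word] = prob * (topic_marginal / static_unigram[word]) ** beta
    # Renormalise so the adapted conditionals sum to one for this history.
    norm = sum(scaled.values())
    return {word: value / norm for word, value in scaled.items()}


# Toy data (hypothetical): a three-word vocabulary and one history.
static_unigram = {"bank": 0.4, "river": 0.3, "loan": 0.3}
static_cond = {"bank": 0.5, "river": 0.2, "loan": 0.3}
topic_counts = {"river": 8, "bank": 2}

adapted = adapt_conditional(static_cond, static_unigram, topic_counts)
```

With no topic evidence the ratio is one for every word and the static conditionals are returned unchanged. As topic counts accumulate, words frequent in the topic text gain probability at the expense of the rest.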