Publication | Closed Access
A tree-based statistical language model for natural language speech recognition
Citations: 253
References: 10
Year: 1989
Engineering, Speech Corpus, Binary Decision Tree, Spoken Language Processing, Language Processing, Text Mining, Speech Recognition, Natural Language Processing, Computational Linguistics, Next Word, Language Engineering, Grammar, Language Studies, Spoken Language Understanding, Machine Translation, Tree Language, NLP Task, Language Technology, Language Modeling (Natural Language Processing), Speech Communication, Equivalent Trigram Model, Treebanks, Language Recognition, Speech Processing, Language Modeling (Theoretical Linguistics), Speech Perception, Linguistics
The problem of predicting the next word a speaker will say, given the words already spoken, is discussed. Specifically, the problem is to estimate the probability that a given word will be the next word uttered. Algorithms are presented for automatically constructing a binary decision tree designed to estimate these probabilities. At each node of the tree there is a yes/no question relating to the words already spoken, and at each leaf there is a probability distribution over the allowable vocabulary. Ideally, these nodal questions can take the form of arbitrarily complex Boolean expressions, but computationally cheaper alternatives are also discussed. Some results obtained on a 5000-word vocabulary with a tree designed to predict the next word spoken from the preceding 20 words are included. The tree is compared to an equivalent trigram model and shown to be superior.
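The structure described in the abstract (a yes/no question about the word history at each internal node, a probability distribution over the vocabulary at each leaf) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: it shows only how a finished tree would be queried, not how it is constructed. The names `Leaf`, `Node`, `word_at`, and `next_word_prob`, the toy questions, and all probabilities are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Union

History = Sequence[str]  # the words already spoken, oldest first


@dataclass
class Leaf:
    # Probability distribution over the vocabulary (toy values, not real data).
    dist: Dict[str, float]


@dataclass
class Node:
    # A yes/no question about the history, plus one subtree per answer.
    question: Callable[[History], bool]
    yes: "Tree"
    no: "Tree"


Tree = Union[Leaf, Node]


def word_at(offset: int, words) -> Callable[[History], bool]:
    """Hypothetical question type: is the word `offset` positions back in `words`?"""
    word_set = frozenset(words)

    def ask(history: History) -> bool:
        # A history that is too short answers "no".
        return len(history) >= offset and history[-offset] in word_set

    return ask


def next_word_prob(tree: Tree, history: History, word: str) -> float:
    """Follow the answers from the root to a leaf, then read off P(word | history)."""
    node = tree
    while isinstance(node, Node):
        node = node.yes if node.question(history) else node.no
    return node.dist.get(word, 0.0)


# Toy tree: the first question looks one word back, the second two words back.
toy_tree = Node(
    question=word_at(1, {"the", "a"}),
    yes=Leaf({"dog": 0.5, "cat": 0.5}),
    no=Node(
        question=word_at(2, {"the"}),
        yes=Leaf({"barks": 0.75, "sleeps": 0.25}),
        no=Leaf({"the": 1.0}),
    ),
)
```

For example, `next_word_prob(toy_tree, ["see", "the"], "dog")` follows the "yes" branch at the root and reads the probability of "dog" from that leaf. Because a question may inspect any position in the history, a tree of this shape is not limited to the two preceding words that a trigram model conditions on.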