Publication | Closed Access
An overview of the SPHINX speech recognition system
Citations: 438
References: 41
Year: 1990
Engineering, Spoken Language Processing, Triphone Models, Language Processing, Speech Recognition, Natural Language Processing, Speaker Independence, Language Documentation, Computational Linguistics, Robust Speech Recognition, Speech Interface, Automatic Recognition, Voice Recognition, Language Studies, Spoken Language Understanding, Function-word-dependent Phone Models, Computer Science, Speech Communication, Speech Technology, Speech Acoustics, Speech Processing, Speech Input, Speech Perception, Linguistics
A description is given of SPHINX, a system that demonstrates the feasibility of accurate, large-vocabulary, speaker-independent, continuous speech recognition. SPHINX is based on discrete hidden Markov models (HMMs) with parameters derived from linear predictive coding (LPC). To provide speaker independence, knowledge was added to these HMMs in several ways: multiple codebooks of fixed-width parameters, and an enhanced recognizer with carefully designed models and word-duration modeling. To deal with coarticulation in continuous speech, yet still adequately represent a large vocabulary, two new subword speech units are introduced: function-word-dependent phone models and generalized triphone models. With grammars of perplexity 997, 60, and 20, SPHINX attained word accuracies of 71, 94, and 96%, respectively, on a 997-word task.
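For readers unfamiliar with discrete HMMs, the following is a minimal sketch of how such a model scores a sequence of codebook symbols using the standard forward algorithm. It is a generic textbook illustration, not the SPHINX implementation: the function name `forward_likelihood`, the two-state topology, the three-symbol codebook, and every probability value are assumptions made up for the example.

```python
def forward_likelihood(init, trans, emit, observations):
    """Return P(observations | model) for a discrete HMM (forward algorithm).

    init[j]     : probability of starting in state j
    trans[i][j] : probability of moving from state i to state j
    emit[j][k]  : probability that state j emits codebook symbol k
    observations: non-empty list of codebook symbol indices
    """
    n_states = len(init)
    # alpha[j] = P(o_1..o_t, state at time t = j)
    alpha = [init[j] * emit[j][observations[0]] for j in range(n_states)]
    for symbol in observations[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n_states)) * emit[j][symbol]
            for j in range(n_states)
        ]
    return sum(alpha)


# Hypothetical toy model: 2 states, 3 codebook symbols (values are illustrative).
init = [1.0, 0.0]
trans = [[0.6, 0.4],
         [0.0, 1.0]]
emit = [[0.7, 0.2, 0.1],
        [0.1, 0.3, 0.6]]

likelihood = forward_likelihood(init, trans, emit, [0, 1, 2])
```

In a recognizer of this general kind, each frame of speech would first be mapped to a codebook index by vector quantization, and the symbol sequence would then be scored against competing models. The sketch covers only the scoring step for a single model and a single codebook.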