Publication | Closed Access
Speech Recognition with Primarily Temporal Cues
Year: 1995 | Citations: 3.1K | References: 19
Temporal Envelopes, Speech Analysis, Health Sciences, Neurolinguistics, Perfect Speech Recognition, Robust Speech Recognition, Dynamic Temporal Pattern, Speech Processing, Voice Recognition, Speech Input, Language Studies, Primarily Temporal Cues, Speech Perception, Signal Processing, Linguistics, Speech Communication, Speech Technology, Speech Recognition
Speech recognition can be nearly perfect even when spectral information is greatly reduced. The study extracted temporal envelopes from broad frequency bands and used them to modulate noises of the same bandwidths, preserving the envelope cues in each band while leaving listeners only severely degraded information about the distribution of spectral energy. Recognition of consonants, vowels, and words in simple sentences improved markedly as the number of bands increased, and performance was already high with only three bands of modulated noise. A dynamic temporal pattern in a few broad spectral regions is therefore sufficient for speech recognition.
Nearly perfect speech recognition was observed under conditions of greatly reduced spectral information. Temporal envelopes of speech were extracted from broad frequency bands and were used to modulate noises of the same bandwidths. This manipulation preserved temporal envelope cues in each band but restricted the listener to severely degraded information on the distribution of spectral energy. The identification of consonants, vowels, and words in simple sentences improved markedly as the number of bands increased; high speech recognition performance was obtained with only three bands of modulated noise. Thus, the presentation of a dynamic temporal pattern in only a few broad spectral regions is sufficient for the recognition of speech.
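The manipulation described in the abstract (split speech into broad bands, extract each band's temporal envelope, and use it to modulate noise of the same bandwidth) can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the function names `bandlimit` and `noise_vocode`, the idealized FFT brick-wall filters, the half-wave rectification step, the 50 Hz envelope cutoff, and any band edges passed in are all assumptions chosen for the example.

```python
import numpy as np


def bandlimit(x, fs, lo, hi):
    """Keep only FFT bins in [lo, hi) Hz (an idealized brick-wall filter)."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spec[(freqs < lo) | (freqs >= hi)] = 0.0
    return np.fft.irfft(spec, n=len(x))


def noise_vocode(speech, fs, edges, env_cutoff=50.0, seed=0):
    """Replace each band of `speech` with envelope-modulated noise.

    edges      -- ascending band edges in Hz (hypothetical values, chosen by the caller)
    env_cutoff -- envelope low-pass cutoff in Hz (an assumed value)
    """
    speech = np.asarray(speech, dtype=float)
    rng = np.random.default_rng(seed)
    out = np.zeros(len(speech))
    for lo, hi in zip(edges[:-1], edges[1:]):
        # 1. Isolate one broad analysis band of the speech signal.
        band = bandlimit(speech, fs, lo, hi)
        # 2. Temporal envelope: half-wave rectify, then low-pass.
        env = bandlimit(np.maximum(band, 0.0), fs, 0.0, env_cutoff)
        # The brick-wall low-pass can ring slightly below zero; clip it.
        env = np.maximum(env, 0.0)
        # 3. Noise carrier restricted to the same band.
        carrier = bandlimit(rng.standard_normal(len(speech)), fs, lo, hi)
        # 4. Modulate the carrier, then re-limit to the band, because
        #    modulation spreads energy slightly past the band edges.
        out += bandlimit(env * carrier, fs, lo, hi)
    return out
```

Calling `noise_vocode(x, fs, [100.0, 800.0, 1500.0, 4000.0])` on a mono signal `x` sampled at `fs` Hz would give a three-band version; those edges are illustrative only. The output keeps each band's slow amplitude fluctuations while the fine spectral detail inside the band is replaced by noise, which is the property the abstract relies on.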