Publication | Closed Access
Articulatory Feature-Based Methods for Acoustic and Audio-Visual Speech Recognition: Summary from the 2006 JHU Summer Workshop
96 Citations · 21 References · Year: 2007 · Venue: Unknown
Topics: Music, Engineering, Machine Learning, Phonology, Articulatory Feature-Based Methods, Speech Recognition, Data Science, Phonetics, Robust Speech Recognition, Voice Recognition, Language Studies, JHU Summer Workshop, Audio-Visual Speech Recognition, AF Classification, Computer Science, Deep Learning, AF States, Distant Speech Recognition, Speech Communication, Speech Technology, Johns Hopkins Workshop, Speech Processing, Speech Input, Speech Perception
We report on investigations, conducted at the 2006 Johns Hopkins Workshop, into the use of articulatory features (AFs) for observation and pronunciation models in speech recognition. In the area of observation modeling, we use the outputs of AF classifiers both directly, in an extension of hybrid HMM/neural network models, and as part of the observation vector, an extension of the "tandem" approach. In the area of pronunciation modeling, we investigate a model having multiple streams of AF states with soft synchrony constraints, for both audio-only and audio-visual recognition. The models are implemented as dynamic Bayesian networks, and tested on tasks from the Small Vocabulary Switchboard (SVitchboard) corpus and the CUAVE audio-visual digits corpus. Finally, we analyze AF classification and forced alignment using a newly collected set of feature-level manual transcriptions.
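The "tandem" extension mentioned in the abstract appends classifier outputs to the acoustic observation vector. A minimal sketch of that idea, assuming hypothetical frame-level MFCC features and AF-classifier posteriors (array shapes and the helper name `tandem_features` are illustrative, not from the paper):

```python
import numpy as np

def tandem_features(acoustic, af_posteriors, eps=1e-10):
    """Append AF-classifier posteriors to each acoustic frame.

    Tandem systems commonly use log posteriors, so we clip to avoid
    log(0) before concatenating along the feature axis.
    """
    log_post = np.log(np.clip(af_posteriors, eps, None))
    return np.concatenate([acoustic, log_post], axis=1)

# Hypothetical data: 100 frames of 39-dim MFCCs and posteriors
# over 8 articulatory-feature classes.
mfcc = np.random.randn(100, 39)
post = np.random.dirichlet(np.ones(8), size=100)

obs = tandem_features(mfcc, post)
print(obs.shape)  # (100, 47): 39 acoustic dims + 8 AF log posteriors
```

In a full system the augmented vectors would then be modeled by the recognizer's observation distributions; this sketch only shows the feature-level concatenation.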