Publication | Closed Access
A probabilistic framework for feature-based speech recognition
Citations: 129
References: 20
Year: 2002
Venue: Unknown
Keywords: Observation Space, Engineering, Machine Learning, Spoken Language Processing, Single Segmentation, Speech Recognition, Natural Language Processing, Pattern Recognition, Phonetics, Robust Speech Recognition, Voice Recognition, Language Studies, Computer Science, Normalization Criterion, Speech Communication, Speech Technology, Probabilistic Framework, Speech Processing, Speech Input, Speech Perception, Linguistics
Most current speech recognizers use an observation space based on a temporal sequence of "frames" (e.g., Mel-cepstra). Another class of recognizer further processes these frames to produce a segment-based network and represents each segment by fixed-dimensional "features". In such feature-based recognizers, the observation space takes the form of a temporal network of feature vectors, so a single segmentation of an utterance uses only a subset of all possible feature vectors. In this paper, we examine a maximum a posteriori (MAP) decoding strategy for feature-based recognizers and develop a normalization criterion that is useful for a segment-based Viterbi or A* search. We report experimental results for phonetic recognition on the TIMIT corpus, where we achieved context-independent and context-dependent (using diphones) results on the core test set of 64.1% and 69.5%, respectively.
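To make the idea of a Viterbi search over a segment network concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the function name `segment_viterbi`, the tuple layout of `segments`, and the toy scores are all hypothetical. It also assumes each segment's log scores have already been normalized so that paths with different numbers of segments are comparable; the paper's normalization criterion itself is not reproduced here.

```python
from collections import defaultdict


def segment_viterbi(segments, start, end):
    """Return the best-scoring path through a segment network.

    segments: list of (t_start, t_end, {label: log_score}) with t_start < t_end.
              The scores are assumed to be pre-normalized (hypothetical input).
    Returns (total_log_score, [(t_start, t_end, label), ...]),
    or None if no chain of segments connects start to end.
    """
    # Index candidate segments by their starting boundary.
    by_start = defaultdict(list)
    for s, e, scores in segments:
        by_start[s].append((e, scores))

    # best[boundary] = (best log score reaching it, backpointer)
    best = {start: (0.0, None)}

    # Every segment moves forward in time, so visiting boundaries in
    # increasing order is a valid topological order of the network.
    for t in sorted({s for s, _, _ in segments} | {start}):
        if t not in best:
            continue  # boundary not reachable from start
        base = best[t][0]
        for e, scores in by_start[t]:
            # Keep only the best label for this segment.
            label, score = max(scores.items(), key=lambda kv: kv[1])
            candidate = base + score
            if e not in best or candidate > best[e][0]:
                best[e] = (candidate, (t, label))

    if end not in best:
        return None

    # Trace backpointers from the final boundary to recover the segmentation.
    path = []
    t = end
    while best[t][1] is not None:
        prev, label = best[t][1]
        path.append((prev, t, label))
        t = prev
    path.reverse()
    return best[end][0], path
```

In a toy network with segments `(0, 2)`, `(2, 5)`, and `(0, 5)`, the search compares the two-segment path against the single long segment and keeps whichever has the higher total score. That comparison is only meaningful if the per-segment scores are on a common scale, which is the issue the abstract's normalization criterion addresses.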