Publication | Closed Access
A linguistic feature representation of the speech waveform
Citations: 29
References: 15
Year: 1993
Speech Sciences, Speech Corpus, Spoken Language Processing, Phonology, Speech Recognition, Phonetics, Computational Linguistics, Audio Signal Analysis, Automatic Recognition, Linguistic Theory Views, Language Studies, Acoustic Analysis, Speech Signal Analysis, Spoken Language Understanding, Health Sciences, Speech Waveform, Speech Communication, Speech Technology, Speech Analysis, Shorthand Notation, Speech Acoustics, Speech Processing, Speech Input, Speech Perception, Linguistics
Linguistic theory views a phoneme as a shorthand notation for a bundle of binary features related to the operation of the speaker's articulators. A representation of the speech waveform in terms of these underlying distinctive features is described here. The estimation of the probability of each of 14 linguistic features being encoded locally in the waveform is performed on a frame-by-frame basis. In going from the abstract to the physical level, it is recognized that the features are encoded in the waveform hierarchically and that time-varying manifestations of a feature within a phonemic segment are possible. These issues are addressed simultaneously through a two-stage procedure. In the first pass, the time portion and broad class of sound being represented by each frame are estimated. On the second pass, for each distinctive linguistic feature, models built explicitly for the estimated broad class portion are evaluated to arrive at the probability that each frame is part of a realization of a phoneme in which the feature is present. The distinctive feature representation is applied to the tasks of phoneme recognition and secondary classification in keyword spotting.
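The two-stage procedure in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the 14 features, the broad classes, the time portions, or the model family, so everything below is a hypothetical stand-in. The sketch assumes two features (`voiced`, `high`), two broad classes with a single time portion each, a nearest-centroid rule for the first pass, and a logistic model per (feature, broad class portion) pair for the second pass.

```python
import math

# Hypothetical feature names; the abstract mentions 14 features but does not list them.
FEATURES = ["voiced", "high"]


def sigmoid(x):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def first_pass(frames, centroids):
    """Toy first pass: label each frame with the nearest (broad class, time portion).

    A nearest-centroid rule is an assumption made for illustration only.
    """
    labels = []
    for frame in frames:
        def squared_distance(label):
            return sum((a - b) ** 2 for a, b in zip(frame, centroids[label]))
        labels.append(min(centroids, key=squared_distance))
    return labels


def second_pass(frames, labels, feature_models):
    """Toy second pass: for every feature, evaluate the model built for the
    frame's estimated (broad class, time portion) to get a per-frame probability."""
    result = []
    for frame, label in zip(frames, labels):
        frame_probs = {}
        for feature in FEATURES:
            weights, bias = feature_models[(feature, label)]
            score = sum(w * x for w, x in zip(weights, frame)) + bias
            frame_probs[feature] = sigmoid(score)
        result.append(frame_probs)
    return result


# Hypothetical 2-dimensional acoustic frames (made-up values).
frames = [[0.9, 0.1], [0.1, 0.8]]

# Hypothetical centroids, one per (broad class, time portion) label.
centroids = {
    ("vowel", "middle"): [1.0, 0.0],
    ("fricative", "middle"): [0.0, 1.0],
}

# Hypothetical logistic models, one per (feature, label) pair: (weights, bias).
feature_models = {
    ("voiced", ("vowel", "middle")): ([2.0, 0.0], 1.0),
    ("voiced", ("fricative", "middle")): ([0.0, -2.0], 0.0),
    ("high", ("vowel", "middle")): ([1.0, -1.0], 0.0),
    ("high", ("fricative", "middle")): ([0.0, 0.0], -3.0),
}

labels = first_pass(frames, centroids)
probs = second_pass(frames, labels, feature_models)
```

Here `probs` holds one dictionary per frame that maps each feature to a probability, which mirrors the frame-by-frame representation the abstract describes. Conditioning the second-pass model on the first-pass label is the one structural point the sketch is meant to show; the numbers themselves carry no meaning.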