Publication | Closed Access
Long-term feature averaging for speaker recognition
Citations: 83
References: 13
Year: 1977
Topics: Speech Sciences, Machine Learning, Engineering, Speech Kinematics, Voice Evaluation, Acoustic Modeling, Speech Recognition, Data Science, Pattern Recognition, Speaker Diarization, Robust Speech Recognition, Acoustic Analysis, Health Sciences, Parameter Variability, Speech Acoustic, Signal Processing, Speech Communication, Potential Benefits, Voice, Long-term Feature, Multi-speaker Speech Recognition, Speech Acoustics, Speech Processing, Speech Perception, Linguistics, Speaker Recognition
The potential benefits of long-term parameter averaging for speaker recognition were investigated. Parameters studied were pitch, gain, and reflection coefficients. Parameter variability was computed over various averaging lengths, from one-frame averaging (in effect, no averaging) to 1000-frame averaging (about 70 s of speech). It was demonstrated that the between-to-within speaker variance ratio, measured over several speakers, was significantly increased by performing long-term averaging of the parameter sets. The reflection coefficient averages for k_2 and k_6, respectively, were shown to produce the highest variance ratios.
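The effect described in the abstract can be illustrated with a small sketch. The abstract does not give the exact definition of the variance ratio, so the code below assumes a common form: the variance of the per-speaker means divided by the mean of the per-speaker variances, computed on block averages of one parameter. The function names `block_average` and `variance_ratio`, the speaker offsets, and the synthetic Gaussian frame values are all hypothetical and stand in for real pitch, gain, or reflection-coefficient tracks.

```python
import random
from statistics import mean, pvariance


def block_average(values, n):
    """Average consecutive, non-overlapping blocks of n frames.

    Any trailing frames that do not fill a whole block are dropped.
    """
    return [mean(values[i:i + n]) for i in range(0, len(values) - n + 1, n)]


def variance_ratio(per_speaker):
    """Between-to-within speaker variance ratio for a single parameter.

    per_speaker holds one list of (averaged) parameter values per speaker.
    Assumed definition: variance of the speaker means divided by the
    mean of the per-speaker variances.
    """
    speaker_means = [mean(v) for v in per_speaker]
    between = pvariance(speaker_means)
    within = mean(pvariance(v) for v in per_speaker)
    return between / within


# Synthetic stand-in data (an assumption, not data from the paper):
# three speakers whose parameter differs by a small fixed offset,
# buried in independent frame-to-frame noise.
rng = random.Random(0)
speaker_offsets = [-0.2, 0.0, 0.2]
frames = [
    [offset + rng.gauss(0.0, 1.0) for _ in range(4000)]
    for offset in speaker_offsets
]

# Variance ratio for several averaging lengths, from no averaging (1 frame)
# up to 1000 frames.
ratios = {
    n: variance_ratio([block_average(f, n) for f in frames])
    for n in (1, 10, 100, 1000)
}

for n, ratio in ratios.items():
    print(f"{n:>5} frames: ratio = {ratio:.3f}")
```

In this toy setup the frame noise is independent, so the within-speaker variance shrinks roughly in proportion to the averaging length while the between-speaker variance stays about the same. Real speech parameters are correlated from frame to frame, so the gain from averaging would be expected to be smaller than in this sketch.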