Publication | Closed Access
Analyzing noise robustness of MFCC and GFCC features in speaker identification
Citations: 201
References: 8
Year: 2013
Unknown Venue
Keywords: Engineering, Biometrics, Speech Enhancement, GFCC Features, Acoustic Modeling, MFCC Robustness, Speech Recognition, Speaker Identification, Phonetics, Speaker Diarization, Noise, Audio Analysis, Robust Speech Recognition, Voice Recognition, Noise Robustness, Health Sciences, Distant Speech Recognition, Signal Processing, Speech Communication, Intrinsic Robustness, Automatic Speaker Recognition, Speech Processing, Speech Perception, Speaker Recognition
Automatic speaker recognition can achieve a high level of performance when training and testing conditions are matched. However, performance drops significantly in mismatched noisy conditions. Recent research indicates that a new speaker feature, gammatone frequency cepstral coefficients (GFCC), is more robust to noise than the commonly used mel-frequency cepstral coefficients (MFCC). To gain a deeper understanding of the intrinsic robustness of GFCC relative to MFCC, we design speaker identification experiments that systematically analyze their differences and similarities. This study reveals that the nonlinear rectification is the primary source of the difference in noise robustness. Moreover, this study suggests how to enhance MFCC robustness and how to further improve GFCC robustness by adopting a different time-frequency representation.
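The role of the nonlinear rectification can be illustrated with a minimal sketch. It assumes, following common practice rather than anything stated in the abstract, that MFCC compresses filterbank energies with a logarithm while GFCC uses a cubic root, and that both then apply a DCT. The function names, the toy energies, and the constant noise offset are hypothetical; the sketch omits the mel and gammatone filterbanks and is not the authors' experimental setup.

```python
import numpy as np


def dct_ii(x):
    """Unnormalized DCT-II along the last axis."""
    n = x.shape[-1]
    k = np.arange(n)[:, None]   # output coefficient index
    m = np.arange(n)[None, :]   # input channel index
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    return x @ basis.T


def cepstra(energies, rectifier):
    """Compress filterbank energies, then decorrelate with a DCT.

    rectifier: "log" (assumed MFCC-style) or "cubic_root" (assumed GFCC-style).
    """
    e = np.maximum(energies, 1e-10)  # floor to keep the log finite
    if rectifier == "log":
        compressed = np.log(e)
    elif rectifier == "cubic_root":
        compressed = np.cbrt(e)
    else:
        raise ValueError(f"unknown rectifier: {rectifier}")
    return dct_ii(compressed)


if __name__ == "__main__":
    # Toy data: 5 frames x 32 channels of made-up energies (not real speech).
    rng = np.random.default_rng(0)
    clean = rng.uniform(0.01, 1.0, size=(5, 32))
    noisy = clean + 0.05  # hypothetical additive noise energy per channel

    for name in ("log", "cubic_root"):
        c_clean = cepstra(clean, name)
        c_noisy = cepstra(noisy, name)
        # Relative change in the cepstra caused by the added noise.
        shift = np.linalg.norm(c_noisy - c_clean) / np.linalg.norm(c_clean)
        print(f"{name}: relative cepstral shift = {shift:.3f}")
```

Running the script prints how far the cepstra move under each rectifier when the same noise offset is added. Toy numbers like these only show where the rectification sits in the pipeline; they say nothing about speaker identification accuracy.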