Publication | Closed Access
SNR-dependent non-uniform spectral compression for noisy speech recognition
Citations: 15
References: 8
Year: 2004
Unknown Venue
Engineering, Speech Enhancement, Perceived Loudness, Acoustic Modeling, Speech Recognition, Speech Coding, Noise, Robust Speech Recognition, Speech Signal Analysis, Health Sciences, Standard MFCC, Distant Speech Recognition, Signal Processing, Speech Communication, Speech Processing, Speech Separation, Speech Perception, Masked Loudness Function, Noisy Speech Recognition
It is known that the loudness of a tone as perceived by a human listener is spectrally masked by background noise. This masking effect not only shifts the just-audible sound pressure level of the tone, but also produces a masked loudness function with a steeper slope than the unmasked one. This masking property of perceived loudness motivates a new mel-scale-based feature extraction method with non-uniform spectral compression for speech recognition in noisy environments. In this method, the speech power spectrum undergoes mel-scaled band-pass filtering, as in the standard MFCC front-end. However, the output energies of the filters are compressed by different root values defined by a compression function, which is a function of the SNR in each filter band. With this new scheme of SNR-dependent non-uniform spectral compression (SNSC) for mel-scaled filter-bank-based cepstral coefficients, substantial improvement is found for recognition in different noisy environments, compared with the standard MFCC and with features derived using cubic root spectral compression.
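The abstract does not give the compression function itself, so the following is only a minimal sketch of the general idea: each mel filter-bank energy is raised to a root exponent chosen from that band's SNR, and a DCT then yields cepstral coefficients. Everything specific here is an assumption for illustration and is not taken from the paper: the logistic shape of `compression_exponent`, its bounds (`alpha_low_snr`, `alpha_high_snr`), the midpoint and slope, the direction in which the exponent moves with SNR, and the spectral-subtraction-style SNR estimate in `band_snr_db`.

```python
import math


def band_snr_db(noisy_energy, noise_energy, floor=1e-10):
    """Estimate a band's SNR in dB from its noisy energy and a noise estimate.

    Assumption: speech energy is approximated as noisy minus noise energy.
    """
    speech = max(noisy_energy - noise_energy, floor)
    return 10.0 * math.log10(speech / max(noise_energy, floor))


def compression_exponent(snr_db, alpha_low_snr=0.5, alpha_high_snr=1.0 / 3.0,
                         midpoint_db=10.0, slope=0.2):
    """Hypothetical logistic map from band SNR (dB) to a root exponent.

    The shape, bounds, and direction are illustrative placeholders only.
    """
    w = 1.0 / (1.0 + math.exp(-slope * (snr_db - midpoint_db)))
    return alpha_low_snr + (alpha_high_snr - alpha_low_snr) * w


def snsc_compress(band_energies, noise_energies, floor=1e-10):
    """Compress each filter-bank energy with its own SNR-dependent exponent."""
    compressed = []
    for energy, noise in zip(band_energies, noise_energies):
        alpha = compression_exponent(band_snr_db(energy, noise))
        compressed.append(max(energy, floor) ** alpha)
    return compressed


def dct_ii(values, num_coeffs):
    """Unnormalized DCT-II, used here to turn compressed energies into cepstra."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(num_coeffs)
    ]


if __name__ == "__main__":
    # Made-up energies for four mel bands and a flat noise estimate.
    energies = [4.0, 2.5, 1.2, 0.8]
    noise = [0.5, 0.5, 0.5, 0.5]
    print(dct_ii(snsc_compress(energies, noise), num_coeffs=3))
```

Setting both exponent bounds to 1/3 would reduce this sketch to uniform cubic root compression, one of the baselines the abstract compares against.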