Publication | Closed Access
Robust speaker recognition based on DNN/i-vectors and speech separation
Citations: 39
References: 18
Year: 2017
Unknown Venue
Engineering, Machine Learning, Speech Recognition, Pattern Recognition, Speaker Identification, Phonetics, Robust Speech Recognition, Health Sciences, Robust Speaker Recognition, Noisy Speech, Computer Science, Deep Learning, Distant Speech Recognition, Deep Neural Network, Signal Processing, Speech Communication, Multi-speaker Speech Recognition, Speech Processing, Speech Separation, Speech Perception, Speaker Recognition
Recent research shows that the i-vector framework for speaker recognition can significantly benefit from phonetic information. A common approach is to use a deep neural network (DNN) trained for automatic speech recognition to generate a universal background model (UBM). Studies in this area have been done in relatively clean conditions. However, strong background noise is known to severely reduce speaker recognition performance. This study investigates a phonetically-aware i-vector system in noisy conditions. We propose a front-end to tackle the noise problem by performing speech separation and examine its performance for both verification and identification tasks. The proposed separation system trains a DNN to estimate the ideal ratio mask of the noisy speech. The separated speech is then used to extract enhanced features for the i-vector framework. We compare the proposed system against a multi-condition trained baseline and a traditional GMM-UBM i-vector system. Our proposed system provides an absolute average improvement of 8% in identification accuracy and 1.2% in equal error rate.
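The abstract names the ideal ratio mask (IRM) as the DNN's training target but does not define it. As a rough illustration only, the sketch below computes an IRM from known speech and noise energies using one commonly cited form, (S / (S + N))^beta with beta = 0.5, and applies it to a noisy magnitude spectrogram. The exponent, the function names, and the nested-list layout (frames by frequency channels) are assumptions for illustration and are not taken from the paper; in the proposed system the mask would be estimated by a DNN from noisy features rather than computed from clean signals.

```python
def ideal_ratio_mask(speech_energy, noise_energy, beta=0.5):
    """Compute an ideal ratio mask per time-frequency unit.

    speech_energy, noise_energy: nested lists (frames x channels) of
    non-negative energies for the premixed speech and noise.
    beta: compression exponent; 0.5 is an assumed, commonly used value.
    """
    mask = []
    for s_frame, n_frame in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(s_frame, n_frame):
            total = s + n
            # Units with no energy at all get a mask value of 0.
            row.append((s / total) ** beta if total > 0 else 0.0)
        mask.append(row)
    return mask


def apply_mask(noisy_magnitude, mask):
    """Scale each time-frequency unit of the noisy magnitude by the mask."""
    return [
        [m * x for m, x in zip(m_row, x_row)]
        for m_row, x_row in zip(mask, noisy_magnitude)
    ]


if __name__ == "__main__":
    # Hypothetical toy energies: one frame, three frequency channels.
    speech = [[3.0, 1.0, 0.0]]
    noise = [[1.0, 1.0, 2.0]]
    irm = ideal_ratio_mask(speech, noise)
    print(irm)
    print(apply_mask([[2.0, 2.0, 2.0]], irm))
```

Mask values lie in [0, 1]: units dominated by speech are passed nearly unchanged, while units dominated by noise are attenuated. The masked spectrogram would then feed feature extraction for the i-vector framework.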