Publication | Closed Access
Toward a Universal Synthetic Speech Spoofing Detection Using Phase Information
Citations: 86
References: 50
Year: 2015
Synthetic Voice, Engineering, Biometrics, Speech Recognition, Speech Coding, Phonetics, Robust Speech Recognition, Blizzard Challenge, Voice Recognition, Health Sciences, Speech Synthesis, Computer Science, Distant Speech Recognition, Signal Processing, Speech Communication, Gaussian Mixture Model, Speech Processing, Speech Perception, Speaker Recognition
In the field of speaker verification (SV), it is nowadays feasible and relatively easy to create a synthetic voice to deceive a speech-driven biometric access system. This paper presents a synthetic speech detector that can be connected at the front end or at the back end of a standard SV system, and that will protect it from spoofing attacks coming from state-of-the-art statistical text-to-speech (TTS) systems. The system described is a Gaussian Mixture Model (GMM) based binary classifier whose models are trained on natural and copy-synthesized signals obtained from the Wall Street Journal database. Three different state-of-the-art vocoders are chosen and modeled using two sets of acoustic parameters: 1) relative phase shift and 2) canonical Mel Frequency Cepstral Coefficient (MFCC) parameters, as a baseline. The vocoder dependency of the system and multivocoder modeling features are thoroughly studied. Additional phase-aware vocoders are also tested. Several experiments are carried out, showing that the phase-based parameters perform better and are able to cope with new unknown attacks. The final evaluations, which test synthetic TTS signals obtained from the Blizzard Challenge, validate our proposal.
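The abstract describes a GMM-based binary classifier with one model for natural speech and one for synthetic speech. As a rough illustration only, the sketch below fits two diagonal-covariance GMMs with a few EM iterations and scores an utterance by the difference of average log-likelihoods. Everything specific here is an assumption and not taken from the paper: the function names (`fit_diag_gmm`, `llr_score`), the diagonal covariances, the number of components, and the log-likelihood-ratio decision rule. Feature extraction (relative phase shift or MFCC) is not shown; the inputs are assumed to be frame-by-dimension feature matrices.

```python
import numpy as np

def _component_log_probs(X, weights, means, variances):
    # Log of (weight * diagonal Gaussian density) per frame and component.
    diff = X[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (np.log(2.0 * np.pi * variances)[None, :, :]
                        + diff ** 2 / variances[None, :, :]).sum(axis=2)
    return np.log(weights)[None, :] + log_gauss

def fit_diag_gmm(X, n_components=2, n_iter=50, seed=0):
    # Hypothetical helper: plain EM for a diagonal-covariance GMM.
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    means = X[rng.choice(n, n_components, replace=False)].copy()
    variances = np.tile(X.var(axis=0) + 1e-6, (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each frame.
        log_p = _component_log_probs(X, weights, means, variances)
        log_norm = np.logaddexp.reduce(log_p, axis=1, keepdims=True)
        resp = np.exp(log_p - log_norm)
        # M-step: re-estimate weights, means, and variances.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        second_moment = (resp.T @ (X ** 2)) / nk[:, None]
        variances = np.maximum(second_moment - means ** 2, 1e-6)
    return weights, means, variances

def avg_log_likelihood(X, gmm):
    # Mean per-frame log-likelihood of the feature matrix under the GMM.
    log_p = _component_log_probs(X, *gmm)
    return np.logaddexp.reduce(log_p, axis=1).mean()

def llr_score(X, natural_gmm, synthetic_gmm):
    # Positive score: frames look more like natural speech (assumed rule).
    return avg_log_likelihood(X, natural_gmm) - avg_log_likelihood(X, synthetic_gmm)
```

In use, one would fit `natural_gmm` and `synthetic_gmm` on the two training sets and compare `llr_score` of a test utterance against a threshold; the threshold value is a tuning choice that this sketch does not address.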