Publication | Open Access
You Talk Too Much: Limiting Privacy Exposure Via Voice Input
Year: 2019
Citations: 23
References: 9
Venue: Unknown
Keywords: Privacy Protection, Engineering, Information Security, Communication, Arbitrary Phrases, Speech Recognition, Data Science, Conversation Analysis, Voice Recognition, Health Sciences, Speech Synthesis, Privacy By Design, Privacy Issue, Data Privacy, Speech Output, Computer Science, Text-to-speech, Voice Model, Privacy, Speech Communication, Data Security, Voice, Cloud Computing, Voice Synthesis, Speech Processing, Voice Technology, Voice Interaction
Voice synthesis uses a voice model to synthesize arbitrary phrases. Advances in voice synthesis have made it possible to create an accurate voice model of a targeted individual, which can then be used to generate spoofed audio in his or her voice. Generating an accurate model of a target's voice requires a corpus of the target's speech. This paper makes the observation that the increasing popularity of voice interfaces that use cloud-backed speech recognition (e.g., Siri, Google Assistant, Amazon Alexa) increases the public's vulnerability to voice synthesis attacks. That is, our growing dependence on voice interfaces fosters the collection of our voices. As our main contribution, we show that voice recognition and voice accumulation (that is, the accumulation of users' voices) are separable. This paper introduces techniques for locally sanitizing voice inputs before they are transmitted to the cloud for processing. In essence, these techniques use audio processing to remove distinctive voice characteristics, leaving only the information that the cloud-based services need to perform speech recognition. Our preliminary experiments show that our defenses prevent state-of-the-art voice synthesis techniques from constructing convincing forgeries of a user's speech, while still permitting accurate voice recognition.
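The abstract does not say which audio transformations the sanitizer applies, so the following is only a toy illustration of what altering one distinctive voice characteristic locally, before upload, might look like. It shifts the pitch of a signal by resampling with linear interpolation. The choice of pitch as the characteristic, the resampling approach, and all function names (`sine_wave`, `shift_pitch_by_resampling`, `estimate_frequency`) are assumptions made for illustration; this is not the paper's method.

```python
import math


def sine_wave(freq_hz, sample_rate, duration_s):
    """Generate a pure tone as a stand-in for a voiced speech segment."""
    n = int(sample_rate * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def shift_pitch_by_resampling(samples, factor):
    """Hypothetical sanitizer step: read the input `factor` times faster.

    Played back at the original sample rate, the result has its pitch
    scaled by `factor`. Assumption/limitation: this naive approach also
    changes the duration by 1/factor, which a practical sanitizer would
    likely need to avoid.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if len(samples) < 2:
        return list(samples)

    last = len(samples) - 1
    out_len = int(last / factor) + 1
    out = []
    for i in range(out_len):
        pos = i * factor
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        # Linear interpolation between the two neighbouring input samples.
        out.append((1.0 - frac) * samples[lo] + frac * samples[hi])
    return out


def estimate_frequency(samples, sample_rate):
    """Rough frequency estimate from the zero-crossing count (pure tones only)."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    duration_s = len(samples) / sample_rate
    return crossings / (2 * duration_s)
```

For example, passing a 200 Hz tone sampled at 8000 Hz through `shift_pitch_by_resampling(tone, 1.25)` should yield a signal whose estimated frequency is close to 250 Hz. Whether a transformation like this keeps speech recognizable to a cloud service while defeating voice synthesis is exactly the kind of question the paper's experiments address; the sketch makes no claim about either.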