Publication | Closed Access
Towards Privacy-Preserving Speech Data Publishing
Citations: 35
References: 29
Year: 2018
Unknown Venue
Privacy Protection, Engineering, Information Security, Speech Data Publishing, Privacy Leak, Information Forensics, Communication, Pseudonymization, Speech Recognition, Hardware Security, Data Science, Data Anonymization, Privacy System, Data Management, Data Privacy, Computer Science, Differential Privacy, Privacy, Privacy Leakage, Data Security, Cryptography, Speech Processing, Privacy-preserving Data Publishing, Big Data
Privacy-preserving data publishing has been an active research topic over the last decade. Numerous ingenious attacks on users' privacy, and corresponding defensive measures, have been proposed for the sharing of various kinds of data, ranging from relational data, social network data, and spatiotemporal data to images and videos. Speech data publishing, however, is still untouched in the literature. To fill this gap, we study the privacy risk in speech data publishing and explore the possibility of sanitizing data to achieve privacy protection while preserving data utility. We formulate this optimization problem in a general fashion and present thorough quantifications of privacy and utility. We analyze the complex impacts of possible sanitization methods on privacy and utility, and also design a novel method, key term perturbation, for speech content sanitization. A heuristic algorithm is proposed to personalize the sanitization for each speaker, restricting their privacy leak (the p-leak limit) while minimizing the utility loss. Simulations of linkage attacks and sanitization on real datasets validate the necessity and feasibility of this work.