A Survey on Differentially Private Machine Learning [Review Article]

TLDR

Machine learning has achieved remarkable successes across many domains, yet models can leak private information contained in their training data. Differential privacy is a promising technique for protecting individual-level privacy while preserving the usefulness of the data for model building. This survey reviews differentially private machine learning and groups existing approaches into two categories by mechanism: the Laplace/Gaussian/exponential mechanism, which adds a calibrated amount of noise to the non-private model, and the output/objective perturbation mechanism, which perturbs the output or the objective function with random noise. It also covers differentially private deep learning, motivated by privacy concerns around large-scale data, discusses research challenges in model utility, privacy level, and applications, and outlines future research directions.

Abstract

Recent years have witnessed remarkable successes of machine learning in various applications. However, machine learning models carry a potential risk of leaking private information contained in training data, a risk that has attracted increasing research attention. As one of the mainstream privacy-preserving techniques, differential privacy provides a promising way to prevent the leakage of individual-level private information in training data while preserving the quality of training data for model building. This work provides a comprehensive survey of existing works that combine differential privacy with machine learning, so-called differentially private machine learning, and divides them into two broad categories according to the differential privacy mechanism used: the Laplace/Gaussian/exponential mechanism and the output/objective perturbation mechanism. In the former, a calibrated amount of noise is added to the non-private model; in the latter, the output or the objective function is perturbed by random noise. In particular, the survey covers techniques for differentially private deep learning, which aim to alleviate recent concerns about the privacy of big data contributors. In addition, the research challenges in terms of model utility, privacy level, and applications are discussed. To tackle these challenges, several potential future research directions for differentially private machine learning are pointed out.
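
As a rough illustration of what "a calibrated amount of noise" means in the first category, the following minimal sketch applies the standard Laplace mechanism to a bounded mean. It is not taken from the survey: the function names `laplace_mechanism` and `private_mean`, the clipping bounds, and the choice of a mean as the released quantity are all assumptions made for this example. The noise scale is set to sensitivity divided by epsilon, where sensitivity is the largest change that replacing one record can cause in the released value.

```python
import numpy as np


def laplace_mechanism(value, sensitivity, epsilon, rng):
    """Add Laplace noise with scale sensitivity / epsilon to `value`.

    Hypothetical helper for illustration; `value` may be a scalar or an array.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = sensitivity / epsilon
    noise = rng.laplace(loc=0.0, scale=scale, size=np.shape(value))
    return value + noise


def private_mean(data, lower, upper, epsilon, rng):
    """Release a noisy mean of `data` after clipping to [lower, upper].

    Assumption: the number of records is public, so replacing one record
    changes the clipped mean by at most (upper - lower) / n.
    """
    clipped = np.clip(np.asarray(data, dtype=float), lower, upper)
    sensitivity = (upper - lower) / len(clipped)
    return float(laplace_mechanism(clipped.mean(), sensitivity, epsilon, rng))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ages = np.array([23, 35, 41, 58, 62])
    # Smaller epsilon means stronger privacy and, on average, more noise.
    print(private_mean(ages, lower=0, upper=100, epsilon=0.5, rng=rng))
```

The same pattern illustrates the utility challenge the abstract mentions: lowering epsilon strengthens the privacy guarantee but widens the noise distribution, so the released value drifts further from the true one on average.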
