Publication | Closed Access
A learning theory approach to noninteractive database privacy
256
Citations
27
References
2013
Year
Privacy ProtectionEngineeringInformation SecurityComputational ComplexityInterval QueriesFormal VerificationData ScienceData MiningPrivacy SystemPrivacy-preserving CommunicationData ManagementLearning Theory ApproachSynthetic DatabasesData PrivacyPrivate Information RetrievalComputer ScienceDifferential PrivacyPrivacyData SecurityQuery OptimizationCryptographySynthetic DataFormal Methods
The authors aim to demonstrate that, ignoring computational constraints, synthetic databases can be released that preserve differential privacy while accurately answering large classes of queries, and to develop a relaxed, efficient algorithm for halfspace queries. They present a net‑based synthetic data mechanism whose error scales with the size of the smallest net approximating query answers, a polynomial‑time privacy‑preserving algorithm that accurately answers halfspace queries for a perturbed query, and efficient synthetic data algorithms for interval and axis‑aligned rectangle queries on discrete domains. The mechanism achieves error guarantees for counting queries that depend only on the VC‑dimension, which grows logarithmically with the query class size; they also prove that no worst‑case utility mechanism exists for simple query classes over continuous domains, and that their relaxed algorithm provides accurate answers for halfspace queries.
In this article, we demonstrate that, ignoring computational constraints, it is possible to release synthetic databases that are useful for accurately answering large classes of queries while preserving differential privacy. Specifically, we give a mechanism that privately releases synthetic data useful for answering a class of queries over a discrete domain with error that grows as a function of the size of the smallest net approximately representing the answers to that class of queries. We show that this in particular implies a mechanism for counting queries that gives error guarantees that grow only with the VC-dimension of the class of queries, which itself grows at most logarithmically with the size of the query class. We also show that it is not possible to release even simple classes of queries (such as intervals and their generalizations) over continuous domains with worst-case utility guarantees while preserving differential privacy. In response to this, we consider a relaxation of the utility guarantee and give a privacy preserving polynomial time algorithm that for any halfspace query will provide an answer that is accurate for some small perturbation of the query. This algorithm does not release synthetic data, but instead another data structure capable of representing an answer for each query. We also give an efficient algorithm for releasing synthetic data for the class of interval queries and axis-aligned rectangles of constant dimension over discrete domains.
| Year | Citations | |
|---|---|---|
Page 1
Page 1