Aggregate Query Answering on Anonymized Tables

TLDR

Releasing microdata for ad hoc analyses raises privacy concerns. Existing privacy models such as k-anonymity and l-diversity suit only categorical sensitive attributes and may leak information when applied directly to numerical ones such as salary, while generalization-based anonymization cannot answer aggregate queries with reasonable accuracy. The paper proposes privacy goals tailored to numerical sensitive attributes and a general permutation-based anonymization framework that, for the same grouping, answers aggregate queries more accurately than generalization-based approaches. It also defines several criteria for optimizing permutations and gives efficient algorithms for each.

Abstract

Privacy is a serious concern when microdata need to be released for ad hoc analyses. The privacy goals of existing privacy protection approaches (e.g., k-anonymity and l-diversity) are suitable only for categorical sensitive attributes. Since applying them directly to numerical sensitive attributes (e.g., salary) may result in undesirable information leakage, we propose privacy goals to better capture the need of privacy protection for numerical sensitive attributes. Complementing the desire for privacy is the need to support ad hoc aggregate analyses over microdata. Existing generalization-based anonymization approaches cannot answer aggregate queries with reasonable accuracy. We present a general framework of permutation-based anonymization to support accurate answering of aggregate queries and show that, for the same grouping, permutation-based techniques can always answer aggregate queries more accurately than generalization-based approaches. We further propose several criteria to optimize permutations for accurate answering of aggregate queries, and develop efficient algorithms for each criterion.
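
To make the idea of permutation-based anonymization concrete, the following is a minimal sketch, not the authors' algorithm. It assumes the grouping is already given and simply shuffles the sensitive values among the records of each group, so every group keeps its exact multiset of sensitive values while the link between an individual record and its value is broken. The function name `permute_within_groups`, the toy (age, salary) records, and the group assignment are hypothetical; the paper's privacy goals and optimization criteria for choosing groups are not reproduced here.

```python
import random
from collections import defaultdict


def permute_within_groups(records, group_of, seed=0):
    """Shuffle sensitive values among the records of each group.

    records  : list of (quasi_identifier, sensitive_value) pairs
    group_of : list of group ids, one per record (assumed to be given)
    Returns a new list with unchanged quasi-identifiers and sensitive
    values permuted inside each group.
    """
    rng = random.Random(seed)

    # Collect the record positions that belong to each group.
    positions = defaultdict(list)
    for idx, group_id in enumerate(group_of):
        positions[group_id].append(idx)

    released = list(records)
    for idxs in positions.values():
        values = [records[i][1] for i in idxs]
        rng.shuffle(values)  # random permutation within the group
        for i, value in zip(idxs, values):
            released[i] = (records[i][0], value)
    return released


# Toy microdata: (age, salary). Ages act as the quasi-identifier.
records = [(25, 30000), (27, 32000), (41, 58000), (45, 61000)]
groups = [0, 0, 1, 1]
released = permute_within_groups(records, groups)

# Aggregate query: SUM(salary) WHERE age <= 30.
# The predicate covers group 0 entirely, so the answer is exact.
total_young = sum(salary for age, salary in released if age <= 30)
```

In this sketch, quasi-identifiers are released unmodified, so a range predicate on age selects exactly the intended records, and a query that covers whole groups is answered without error. A query that cuts through a group can only be answered approximately, which is where the choice of grouping matters.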
