Publication | Open Access
Confidence-ranked reconstruction of census microdata from published statistics
Citations: 23 | References: 9 | Year: 2023
A reconstruction attack on a private dataset <i>D</i> takes as input some publicly accessible information about the dataset and produces a list of candidate elements of <i>D</i>. We introduce a class of data reconstruction attacks based on randomized methods for nonconvex optimization. We empirically demonstrate that our attacks can not only reconstruct full rows of <i>D</i> from aggregate query statistics <i>Q</i>(<i>D</i>)∈ℝ<sup><i>m</i></sup> but can do so in a way that reliably ranks reconstructed rows by their odds of appearing in the private data, providing a signature that could be used for prioritizing reconstructed rows for further actions such as identity theft or hate crime. We also design a sequence of baselines for evaluating reconstruction attacks. Our attacks significantly outperform those that are based only on access to a public distribution or population from which the private dataset <i>D</i> was sampled, demonstrating that they are exploiting information in the aggregate statistics <i>Q</i>(<i>D</i>) and not simply the overall structure of the distribution. In other words, the queries <i>Q</i>(<i>D</i>) permit reconstruction of elements of this particular dataset, not merely of the distribution from which <i>D</i> was drawn. These findings are established on both 2010 US decennial Census data and queries and on Census-derived American Community Survey datasets. Taken together, our methods and experiments illustrate the risks of releasing numerically precise aggregate statistics of a large dataset and provide further motivation for the careful application of provably private techniques such as differential privacy.
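The abstract does not specify the optimizer, the query class, or the ranking rule, so the following is only a toy sketch of the general idea, not the authors' method. It assumes binary attributes, 2-way marginal counts standing in for <i>Q</i>(<i>D</i>), a simple randomized hill climber in place of the paper's nonconvex optimization, and a confidence score equal to the fraction of independent restarts in which a row appears. All function names (`marginal_counts`, `error`, `reconstruct`, `ranked_candidates`) and the sample data are hypothetical.

```python
import itertools
import random
from collections import Counter


def marginal_counts(rows):
    """Toy Q(D): count of rows for every pair of columns and pair of values."""
    counts = Counter()
    for row in rows:
        for cols in itertools.combinations(range(len(row)), 2):
            counts[(cols, tuple(row[c] for c in cols))] += 1
    return counts


def error(rows, target):
    """L1 distance between a candidate dataset's statistics and the published ones."""
    got = marginal_counts(rows)
    return sum(abs(got[key] - target[key]) for key in set(got) | set(target))


def reconstruct(target, n_rows, n_cols, rng, steps=2000):
    """One randomized local-search run (assumed stand-in for the paper's optimizer).

    Starts from a random dataset, resamples one row at a time, and keeps
    any move that does not increase the error.
    """
    rows = [tuple(rng.randint(0, 1) for _ in range(n_cols)) for _ in range(n_rows)]
    best = error(rows, target)
    for _ in range(steps):
        if best == 0:
            break
        i = rng.randrange(n_rows)
        old = rows[i]
        rows[i] = tuple(rng.randint(0, 1) for _ in range(n_cols))
        new = error(rows, target)
        if new <= best:
            best = new
        else:
            rows[i] = old  # revert a worsening move
    return rows, best


def ranked_candidates(target, n_rows, n_cols, runs=20, seed=0):
    """Rank distinct rows by the fraction of runs whose output contains them."""
    seen = Counter()
    for r in range(runs):
        rows, _ = reconstruct(target, n_rows, n_cols, random.Random(seed + r))
        seen.update(set(rows))
    scored = ((row, count / runs) for row, count in seen.items())
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Made-up "private" data; the attacker only ever sees `published`.
private = [(0, 0, 1, 1), (1, 0, 1, 0), (1, 1, 0, 0), (0, 0, 1, 1), (1, 1, 1, 0)]
published = marginal_counts(private)

for row, score in ranked_candidates(published, len(private), 4)[:5]:
    print(row, round(score, 2), "in private data:", row in private)
```

In this sketch the score is a heuristic signal of confidence rather than a calibrated probability. The intuition is that rows forced by the published statistics tend to recur across restarts, while rows that are merely plausible under the overall distribution tend to vary from run to run.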