Publication | Open Access
Anonymizing Unstructured Data
Citations: 23
References: 24
Year: 2008
Engineering, Information Security, K-anonymity Model, Pseudonymization, Computational Social Science, Information Retrieval, Data Science, Data Mining, Data Anonymization, Management, Data Integration, Data Management, Unstructured Data, Knowledge Discovery, Data Privacy, Private Information Retrieval, Computer Science, Privacy Anonymity, Differential Privacy, Privacy, Market-basket Datasets, Data Security, Private Information, Big Data
In this paper we consider the problem of anonymizing datasets in which each individual is associated with a set of items that constitute private information about that individual. Illustrative datasets include market-basket datasets and search engine query logs. We formalize the notion of k-anonymity for set-valued data as a variant of the k-anonymity model for traditional relational datasets. We define an optimization problem that arises from this definition of anonymity and provide O(k log k)-approximation and O(1)-approximation algorithms for it. We demonstrate the applicability of our algorithms to the America Online query log dataset.
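To make the notion of k-anonymity for set-valued data concrete, here is a minimal sketch of a checker. It assumes one natural reading of the definition, namely that a dataset is k-anonymous when every individual's item set is identical to the item sets of at least k - 1 other individuals. The abstract does not spell out the paper's exact formalization, so that reading, the function name `is_k_anonymous`, and the sample baskets are all illustrative assumptions. The sketch only tests the property; it is not one of the paper's approximation algorithms.

```python
from collections import Counter
from typing import Hashable, Iterable


def is_k_anonymous(records: Iterable[Iterable[Hashable]], k: int) -> bool:
    """Check the assumed property: every distinct item set occurs at least k times."""
    # Freeze each record so that identical item sets hash to the same key,
    # regardless of the order in which the items were listed.
    counts = Counter(frozenset(record) for record in records)
    return all(count >= k for count in counts.values())


# Hypothetical market-basket data: one item set per individual.
baskets = [
    {"milk", "bread"},
    {"bread", "milk"},
    {"milk", "eggs"},
]

print(is_k_anonymous(baskets, 1))  # True
print(is_k_anonymous(baskets, 2))  # False: {"milk", "eggs"} occurs only once
```

Under this reading, an anonymization procedure would have to add or remove items until every item set is shared by at least k individuals, which is where an optimization problem of the kind the abstract mentions would arise.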