Publication | Open Access
PRIVACY PRESERVING k-MEANS CLUSTERING IN MULTI-PARTY ENVIRONMENT
Citations: 37
References: 13
Year: 2007
Unknown Venue
Engineering, Machine Learning, Information Security, Cluster Analysis, Data Mining Security, Mining Methods, Knowledge Discovery In Databases, Data Science, Data Mining, Privacy System, Privacy-preserving Communication, K-means Clustering, Data Management, Clustering (Nuclear Physics), Privacy Service, Knowledge Discovery, Data Privacy, Private Information Retrieval, Computer Science, Differential Privacy, Privacy, Data Security, Cryptography, Clustering (Data Mining), Security Data Mining, Big Data
Data mining extracts knowledge from databases, but related data is now often distributed across multiple parties, each of which wants to keep its own records private. Privacy-preserving techniques are therefore essential for tasks such as cluster analysis, and in particular for k-means clustering, which partitions data into meaningful groups. This paper proposes privacy-preserving protocols for k-means clustering of horizontally or vertically partitioned data held by multiple parties. The protocols employ secure multi-party computation and use a secure comparison sub-protocol, which addresses the Millionaires' Problem, as a building block.
Meaningful and valuable knowledge is often extracted from databases by data mining algorithms. Databases are now frequently distributed among two or more parties for reasons such as physical and geographical restrictions and, most importantly, privacy. Related data is normally maintained by more than one organization, each of which wants to keep its individual information private. Privacy-preserving techniques and protocols are therefore designed to perform data mining in distributed environments where privacy is a major concern. Cluster analysis is a data mining technique that divides data into meaningful clusters, and it plays an important role in fields such as bioinformatics, marketing, machine learning, climate, and medicine. k-means clustering is a prominent algorithm in this category; it creates a one-level clustering of the data. In this paper we introduce privacy-preserving protocols for this algorithm, along with a protocol for secure comparison, known as the Millionaires' Problem, as a sub-protocol, to handle the clustering of data that is horizontally or vertically partitioned among two or more parties.
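To make the horizontally partitioned setting concrete, the following is a minimal sketch and not the paper's protocol. It assumes each party holds complete rows, assigns its own points to the nearest centroid locally, and contributes only per-cluster sums and counts, which are combined with an additive secret-sharing sum. The names `share`, `secure_sum`, `local_statistics`, and `kmeans_horizontal`, the modulus, and the fixed-point scale are all hypothetical choices made for illustration. The sketch simulates all parties in one process, reveals the aggregated centroids in every iteration, and does not use a secure comparison step, which would be needed, for example, when attributes are split vertically and no single party can compute a full distance.

```python
import random

MODULUS = 2**61 - 1   # assumed ring size for the additive shares
SCALE = 10**6         # assumed fixed-point scale for real-valued coordinates


def share(value, n_parties, rng):
    """Split an integer into n additive shares that sum to value mod MODULUS."""
    shares = [rng.randrange(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(private_values, rng):
    """Sum one private integer per party; only the total is reconstructed."""
    n = len(private_values)
    # Party i splits its value and sends the j-th share to party j.
    distributed = [share(v % MODULUS, n, rng) for v in private_values]
    # Each party adds the shares it received and publishes that partial sum.
    partials = [sum(distributed[i][j] for i in range(n)) % MODULUS for j in range(n)]
    total = sum(partials) % MODULUS
    # Map the ring element back to a signed integer.
    return total - MODULUS if total > MODULUS // 2 else total


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, centroid)) for centroid in centroids]
    return dists.index(min(dists))


def local_statistics(points, centroids):
    """Per-cluster coordinate sums and counts computed on one party's own rows."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for point in points:
        k = nearest(point, centroids)
        counts[k] += 1
        for d in range(dim):
            sums[k][d] += point[d]
    return sums, counts


def kmeans_horizontal(parties, centroids, iterations=10, seed=0):
    """Run k-means where each entry of `parties` is one party's private rows."""
    rng = random.Random(seed)  # not cryptographically secure; illustration only
    for _ in range(iterations):
        stats = [local_statistics(points, centroids) for points in parties]
        updated = []
        for k, old in enumerate(centroids):
            count = secure_sum([counts[k] for _, counts in stats], rng)
            if count == 0:
                updated.append(old)  # keep an empty cluster's centroid unchanged
                continue
            coords = []
            for d in range(len(old)):
                encoded = [round(sums[k][d] * SCALE) for sums, _ in stats]
                coords.append(secure_sum(encoded, rng) / SCALE / count)
            updated.append(coords)
        centroids = updated
    return centroids
```

Calling `kmeans_horizontal([rows_a, rows_b], initial_centroids)` with two lists of points returns the updated centroids without either list being passed to the other party's `local_statistics` call. A real deployment would replace `random.Random` with a cryptographically secure source and would have to decide whether revealing intermediate centroids is acceptable.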