Publication | Closed Access

Privacy-preserving k-means clustering over vertically partitioned data

Citations: 628

References: 23

Year: 2003

TLDR

Privacy concerns can prevent data sharing, yet distributed knowledge discovery can yield valid results while guarding against data disclosure. The study proposes a k-means clustering method for vertically partitioned data, where each site holds different attributes of the same entities. Each site learns the cluster assignment of each entity but learns nothing about the attributes held by other sites.

Abstract

Privacy and security concerns can prevent sharing of data, derailing data mining projects. Distributed knowledge discovery, if done correctly, can alleviate this problem. The key is to obtain valid results, while providing guarantees on the (non)disclosure of data. We present a method for k-means clustering when different sites contain different attributes for a common set of entities. Each site learns the cluster of each entity, but learns nothing about the attributes at other sites.
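
To illustrate why k-means fits the vertically partitioned setting described in the abstract, the sketch below relies on one fact: squared Euclidean distance is a sum over attributes, so each site can compute its share of the distance from the attributes it holds. This is not the paper's protocol. The function names and the toy data are hypothetical, and the summation and comparison are done in the clear here, whereas a privacy-preserving method would have to perform those steps without revealing the per-site values.

```python
# Illustrative sketch only: NOT the secure protocol from the paper.
# All names and data below are hypothetical.


def partial_sq_distances(local_rows, local_centroids):
    """For one site: squared distance from each entity to each cluster
    centroid, restricted to the attributes this site holds."""
    return [
        [sum((x - c) ** 2 for x, c in zip(row, centroid))
         for centroid in local_centroids]
        for row in local_rows
    ]


def assign_clusters(per_site_partials):
    """Add the per-site partial distances and pick the closest cluster.

    Assumption: done in the clear for illustration. A privacy-preserving
    protocol would replace this sum and comparison with secure computation.
    """
    n_entities = len(per_site_partials[0])
    n_clusters = len(per_site_partials[0][0])
    assignments = []
    for i in range(n_entities):
        totals = [
            sum(site[i][k] for site in per_site_partials)
            for k in range(n_clusters)
        ]
        assignments.append(totals.index(min(totals)))
    return assignments


# Toy data: 4 entities; site A holds 2 attributes, site B holds 1.
site_a_rows = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [4.9, 5.0]]
site_b_rows = [[1.0], [1.1], [9.0], [9.2]]

# Each site holds only its own attributes' slice of every centroid.
site_a_centroids = [[0.0, 0.0], [5.0, 5.0]]
site_b_centroids = [[1.0], [9.0]]

assignments = assign_clusters([
    partial_sq_distances(site_a_rows, site_a_centroids),
    partial_sq_distances(site_b_rows, site_b_centroids),
])
print(assignments)  # [0, 0, 1, 1]
```

Every site ends up with the same cluster assignment for each entity, which matches the output the abstract describes. What the sketch leaves open is how to combine the partial distances without exposing them.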
