Publication | Closed Access
Representative clustering of uncertain data
Citations: 38
References: 47
Year: 2014
Venue: unknown
Topics: Cluster Computing, Engineering, Uncertain Database, Uncertain Data, Semantic Web, Unsupervised Machine Learning, Data Science, Data Mining, Uncertainty Quantification, Management, Data Integration, Statistics, Document Clustering, Knowledge Discovery, Single Clustering, Computer Science, Meaningful Clusterings, Representative Clustering, Fuzzy Clustering, Data Modeling
This paper targets the problem of computing meaningful clusterings from uncertain data sets. Existing methods for clustering uncertain data compute a single clustering without any indication of its quality or reliability; decisions based on their results are therefore questionable. We describe a framework based on possible-worlds semantics that, when applied to an uncertain data set, computes a set of representative clusterings, each of which has a probabilistic guarantee not to exceed some maximum distance to the ground-truth clustering, i.e., the clustering of the actual (but unknown) data. Our framework can be combined with any existing clustering algorithm and is the first to provide quality guarantees about its result. In addition, our experimental evaluation shows that our representative clusterings deviate much less from the ground-truth clustering than the results of existing approaches, thus reducing the effect of uncertainty.
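The possible-worlds idea in the abstract can be illustrated with a small Monte Carlo sketch. This is not the paper's algorithm; it is a minimal illustration under assumptions that are not in the abstract. It assumes each uncertain object is a discrete list of `(value, probability)` alternatives, the toy `gap_clustering` function stands in for "any existing clustering algorithm", the distance between clusterings is 1 minus the Rand index, and a single medoid replaces the paper's set of representatives. The returned `coverage` is only an empirical fraction of sampled worlds, not the probabilistic guarantee the paper derives. All function names are hypothetical.

```python
import random
from itertools import combinations


def sample_world(uncertain_objects, rng):
    """Draw one possible world: pick one alternative per uncertain object."""
    world = []
    for alternatives in uncertain_objects:
        values = [v for v, _ in alternatives]
        weights = [p for _, p in alternatives]
        world.append(rng.choices(values, weights=weights, k=1)[0])
    return world


def gap_clustering(world, gap=1.0):
    """Toy stand-in for a clustering algorithm (assumption): sort the 1-D
    values and start a new cluster wherever neighbours differ by more than
    `gap`. Returns one cluster label per object."""
    order = sorted(range(len(world)), key=lambda i: world[i])
    labels = [0] * len(world)
    current = 0
    for prev, nxt in zip(order, order[1:]):
        if world[nxt] - world[prev] > gap:
            current += 1
        labels[nxt] = current
    return labels


def rand_distance(a, b):
    """1 - Rand index: fraction of object pairs on which the two clusterings
    disagree about 'same cluster' vs 'different cluster'.
    Requires at least two objects."""
    pairs = list(combinations(range(len(a)), 2))
    disagree = sum((a[i] == a[j]) != (b[i] == b[j]) for i, j in pairs)
    return disagree / len(pairs)


def representative_clustering(uncertain_objects, n_worlds=200, tau=0.2, seed=0):
    """Sample worlds, cluster each, and return the medoid clustering together
    with the fraction of sampled clusterings within distance `tau` of it."""
    rng = random.Random(seed)
    clusterings = [
        gap_clustering(sample_world(uncertain_objects, rng))
        for _ in range(n_worlds)
    ]
    # Medoid: the sampled clustering with the smallest total distance to all others.
    medoid = min(
        clusterings,
        key=lambda c: sum(rand_distance(c, other) for other in clusterings),
    )
    coverage = sum(rand_distance(medoid, c) <= tau for c in clusterings) / n_worlds
    return medoid, coverage


# Hypothetical data: four uncertain 1-D objects with discrete alternatives.
objects = [
    [(0.0, 0.5), (0.4, 0.5)],
    [(0.5, 1.0)],
    [(5.0, 0.9), (0.9, 0.1)],
    [(5.3, 1.0)],
]
medoid, coverage = representative_clustering(objects)
print(medoid, coverage)
```

In this toy data the third object usually lies near 5.0 but occasionally near 0.9, so most sampled worlds yield the same two-cluster partition. The coverage value shows how often that partition is within `tau` of the sampled alternatives, which is the kind of reliability signal a single clustering of the expected values cannot provide.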