Publication | Closed Access
Active query selection for semi-supervised clustering
Citations: 102
References: 7
Year: 2008
Available Prior Knowledge, Cluster Computing, Active Query Selection, Semi-supervised Clustering, Engineering, Machine Learning, Data Science, Data Mining, Information Retrieval, Pattern Recognition, Clustering Performance, Knowledge Discovery, Document Clustering, Computer Science, Semi-supervised Learning, Text Mining, Query Optimization, Optimization-based Data Mining
Semi-supervised clustering allows a user to specify available prior knowledge about the data to improve the clustering performance. A common way to express this information is in the form of pairwise constraints. A number of studies have shown that, in general, these constraints improve the resulting data partition. However, the choice of constraints is critical, since improperly chosen constraints might actually degrade the clustering performance. We focus on constraint (also known as query) selection for improving the performance of semi-supervised clustering algorithms. We present an active query selection mechanism in which the queries are selected using a min-max criterion. Experimental results on a variety of datasets, using MPCK-means as the underlying semi-supervised clustering algorithm, demonstrate the superior performance of the proposed query selection procedure.
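The abstract does not spell out the min-max criterion, so the following is only a rough sketch of the general idea and not the authors' procedure. It uses farthest-first traversal, a well-known selection rule that repeatedly picks the point whose minimum distance to the already selected points is largest, and then turns oracle answers on the selected points into pairwise must-link and cannot-link constraints. The function names `farthest_first` and `constraints_from_oracle`, the Euclidean distance, and the `oracle` callback are all assumptions made for illustration.

```python
import math


def farthest_first(points, k, start=0):
    """Select up to k point indices by farthest-first traversal (illustrative)."""
    selected = [start]
    # min_dist[i] is the distance from point i to its nearest selected point.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < min(k, len(points)):
        # Pick the point that is farthest from everything selected so far.
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return selected


def constraints_from_oracle(indices, oracle):
    """Query every pair of selected points and split answers into constraints.

    `oracle(i, j)` is a hypothetical user callback that returns True when
    points i and j belong in the same cluster.
    """
    must_link, cannot_link = [], []
    for a in range(len(indices)):
        for b in range(a + 1, len(indices)):
            i, j = indices[a], indices[b]
            if oracle(i, j):
                must_link.append((i, j))
            else:
                cannot_link.append((i, j))
    return must_link, cannot_link
```

The resulting constraint lists could then be handed to a constrained clustering algorithm such as MPCK-means. Spreading queries across well-separated points is one way to avoid the redundant or misleading constraints the abstract warns about, though the paper's own criterion may differ.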