Publication | Closed Access
On Group Nearest Group Query Processing
Citations: 36
References: 16
Year: 2010
Search Optimization, Engineering, Range Searching, Mining Methods, Information Retrieval, Data Science, Data Mining, Data Point, Combinatorial Optimization, Clustering (Nuclear Physics), Such Subsets, Knowledge Discovery, Computer Science, Big Data Search, Distributed Query Processing, Irrelevant Subsets, Query Optimization, Relational Queries, Computational Science, Local Search (Optimization), Approximate Query Answering, Clustering (Data Mining), Similarity Search
Given a data point set D, a query point set Q, and an integer k, the Group Nearest Group (GNG) query finds a subset ω (|ω| ≤ k) of points from D such that the total distance from each point in Q to its nearest point in ω is no greater than that for any other subset ω' (|ω'| ≤ k) of points in D. The GNG query is a partition-based clustering problem that arises in many real applications and is NP-hard. In this paper, the Exhaustive Hierarchical Combination (EHC) algorithm and the Subset Hierarchical Refinement (SHR) algorithm are developed for GNG query processing. While EHC is capable of providing the optimal solution for k = 2, SHR is an efficient approximate approach that combines database techniques with a local search heuristic. Our approaches focus on minimizing the access and evaluation of subsets of cardinality k in D, since the number of such subsets is exponentially greater than |D|. To do that, hierarchical blocks of data points at a high level are used to find an intermediate solution, which is then refined by following the guided search direction at a low level so as to prune irrelevant subsets. Comprehensive experiments on both real and synthetic data sets demonstrate the superiority of SHR in terms of efficiency and quality.
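To make the GNG objective concrete, the following is a minimal brute-force sketch in Python. It is not the EHC or SHR algorithm from the paper; it only evaluates the objective as defined in the abstract and enumerates every candidate subset, which is exactly the exponential cost the paper's methods aim to avoid. The function names `gng_cost` and `gng_brute_force` are hypothetical, and Euclidean distance is assumed since the abstract does not name a metric.

```python
from itertools import combinations
from math import dist  # Euclidean distance (assumed metric), Python 3.8+


def gng_cost(subset, queries):
    """Total distance from each query point to its nearest point in subset."""
    return sum(min(dist(q, p) for p in subset) for q in queries)


def gng_brute_force(data, queries, k):
    """Hypothetical baseline: try every subset of size min(k, |D|).

    Adding a point to a subset can never increase the cost, so checking
    only the largest allowed size also covers all smaller subsets.
    """
    size = min(k, len(data))
    return min(combinations(data, size), key=lambda s: gng_cost(s, queries))


# Small illustrative input (made-up points, not from the paper).
data = [(0, 0), (10, 0), (5, 5)]
queries = [(0, 1), (1, 0), (10, 1)]
best = gng_brute_force(data, queries, k=2)
```

Because this sketch enumerates all size-k combinations, it is only practical for very small D and k. It can serve as a correctness reference when checking an approximate method on toy inputs.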