Publication | Closed Access
Coresets and approximate clustering for Bregman divergences
39
Citations
19
References
2009
Year
Document ClusteringApproximate ClusteringEngineeringInformation TheoryVariational AnalysisOptimization ProblemComputational ComplexityGeneralized K-median ProblemStatistical InferenceComputer ScienceCombinatorial OptimizationApproximation TheorySublinear AlgorithmBregman Divergence Dφ
We study the generalized k-median problem with respect to a Bregman divergence Dφ. Given a finite set P ⊆ Rd of size n, our goal is to find a set C of size k such that the sum of errors cost(P, C) = ΣpeP minceC {DΦ(p, c)} is minimized. The Bregman k-median problem plays an important role in many applications, e.g. information theory, statistics, text classification, and speech processing. We give the first coreset construction for this problem for a large subclass of Bregman divergences, including important dissimilarity measures such as the Kullback-Leibler divergence and the Itakura-Saito divergence. Using these coresets, we give a (1 + e)-approximation algorithm for the Bregman k-median problem with running time O (dkn + d22(k/c)thetas;(1) logk+2n). This result improves over the previousely fastest known (1 + e)-approximation algorithm from [1]. Unlike the analysis of most coreset constructions our analysis does not rely on the construction of e-nets. Instead, we prove our results by purely combinatorial means.
| Year | Citations | |
|---|---|---|
Page 1
Page 1