Publication | Closed Access
Consensus Cluster Center Guided Latent Multi-Kernel Clustering
Citations: 25
References: 43
Year: 2022
Cluster Computing, Base Kernels, Document Clustering, Engineering, Machine Learning, Data Science, Data Mining, Reproducing Kernel Method, Knowledge Discovery, Computer Science, Upper Bound, Consensus Cluster Center, Kernel Method
Existing multi-kernel clustering (MKC) methods usually focus on constructing a fixed-dimension consensus-partition from base kernels to demonstrate their superiority in integrating complementary information. Despite their success, they still suffer from the following limitations: (1) the size of the consensus-partition is always fixed at the upper bound ($k=n$) or the lower bound ($k=c$), where $n$, $c$, and $k$ are the number of samples, the number of clusters, and the partition dimension, respectively, resulting in a suboptimal partition; (2) the learned consensus-partition cannot make full use of the global distribution information hidden in the data.
To address these issues, we propose a latent consensus-partition learning framework for MKC, namely Consensus Cluster Center Guided Latent Multi-kernel Clustering (C$^3$LMC), which includes two methods, i.e., C$^3$LMC$_K$ and C$^3$LMC$_H$. For C$^3$LMC$_K$, we flexibly search for a more suitable dimension of the consensus-partition in a latent embedding space rather than using a fixed partition dimension. Meanwhile, the generation of the latent consensus-partition is guided by a consensus cluster center of the base kernels, so that the global distribution information hidden in the base kernels can be fully captured.
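To make the notion of partition dimension concrete, the following is a minimal sketch of a standard baseline, not the C$^3$LMC solver: the spectral relaxation of kernel k-means applied to a uniformly averaged kernel, with the number of retained eigenvectors $k$ left as a free parameter between $c$ and $n$. The function names `rbf_kernel` and `consensus_partition`, the RBF bandwidths, and the synthetic data are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix for the rows of X (hypothetical helper)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-gamma * d2)

def consensus_partition(kernels, k):
    """Relaxed kernel k-means on the averaged kernel.

    Returns an n x k matrix with orthonormal columns: the eigenvectors of the
    uniformly averaged base kernels belonging to the k largest eigenvalues.
    This is a baseline illustration only, not the method proposed in the paper.
    """
    K = sum(kernels) / len(kernels)
    K = (K + K.T) / 2.0              # guard against round-off asymmetry
    _, vecs = np.linalg.eigh(K)      # eigenvalues come back in ascending order
    return vecs[:, -k:]              # keep the k leading eigenvectors

# Synthetic data: n = 60 samples drawn around c = 3 well-separated centers.
rng = np.random.default_rng(0)
n, c = 60, 3
X = np.vstack([rng.normal(loc=mu, scale=0.3, size=(n // c, 2))
               for mu in (0.0, 3.0, 6.0)])

# Three base kernels with different (assumed) bandwidths.
kernels = [rbf_kernel(X, g) for g in (0.1, 1.0, 10.0)]

H_c = consensus_partition(kernels, k=c)     # lower bound, k = c
H_mid = consensus_partition(kernels, k=10)  # an intermediate dimension, c < k < n
```

Setting `k=c` reproduces the lower-bound choice criticized above, while `k=n` would keep the full eigenbasis; any value in between is a candidate latent dimension of the kind the framework searches over.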
However, C$^3$LMC$_K$ suffers from $\mathcal{O}(n^{2})$ computational and memory complexity. Thus, we further propose C$^3$LMC$_H$ to handle large-scale data by reducing both complexities to $\mathcal{O}(n)$. Two solvers with convergence proofs are developed, and comparisons with recent advances on multiple public datasets validate the effectiveness, superiority, and efficiency of our methods.
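The gap between quadratic and linear memory can be illustrated with simple arithmetic. The sketch below compares storing $m$ dense $n \times n$ base kernels with storing $m$ matrices of size $n \times k$. That the linear-complexity variant operates on $n \times k$ matrices is an assumption made here for illustration; the abstract states only the complexity orders. The function names and the example sizes are likewise hypothetical.

```python
def kernel_storage_bytes(n, m, itemsize=8):
    """Bytes needed for m dense n x n kernel matrices (quadratic in n)."""
    return m * n * n * itemsize

def partition_storage_bytes(n, m, k, itemsize=8):
    """Bytes needed for m dense n x k matrices (linear in n for fixed k)."""
    return m * n * k * itemsize

# Example sizes (assumed): 10,000 samples, 5 base kernels, dimension k = 20,
# with 8-byte double-precision entries.
quadratic = kernel_storage_bytes(10_000, 5)
linear = partition_storage_bytes(10_000, 5, 20)
print(f"n x n kernels: {quadratic / 1e9:.1f} GB, n x k matrices: {linear / 1e6:.1f} MB")
```

Under these assumed sizes the kernel matrices need 4 GB while the $n \times k$ matrices need 8 MB, and the ratio between the two grows linearly with $n$.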