Approximating K‐means‐type Clustering via Semidefinite Programming

Abstract

One of the fundamental clustering problems is to assign n points into k clusters based on minimal sum‐of‐squared distances (MSSC), which is known to be NP‐hard. In this paper, by using matrix arguments, we first model MSSC as a so‐called 0‐1 semidefinite programming (SDP) problem. We show that our 0‐1 SDP model provides a unified framework for several clustering approaches such as normalized k‐cut and spectral clustering. Moreover, the 0‐1 SDP model allows us to solve the underlying problem approximately via the linear programming and SDP relaxations. Second, we consider the issue of how to extract a feasible solution of the original 0‐1 SDP model from the optimal solution of the relaxed SDP problem. By using principal component analysis, we develop a rounding procedure to construct a feasible partitioning from a solution of the relaxed problem. In our rounding procedure, we need to solve a K‐means clustering problem in $\Re^{k-1}$, which can be done in $O(n^{k^2-2k+2})$ time. In case of biclustering, the running time of our rounding procedure can be reduced to $O(n\log n)$. We show that our algorithm provides a 2–approximate solution to the original problem. Promising numerical results for biclustering based on our new method are reported.

References

Page 1

	Year	Citations

Page 1