Publication | Closed Access
Computing Gaussian Mixture Models with EM Using Equivalence Constraints
Citations: 256 | References: 8 | Year: 2003 | Venue: Unknown
Density estimation with Gaussian Mixture Models is a popular generative technique used also for clustering. We develop a framework to incorporate side information in the form of equivalence constraints into the model estimation procedure. Equivalence constraints are defined on pairs of data points, indicating whether the points arise from the same source (positive constraints) or from different sources (negative constraints). Such constraints can be gathered automatically in some learning problems, and are a natural form of supervision in others. For the estimation of model parameters we present a closed form EM procedure which handles positive constraints, and a Generalized EM procedure using a Markov net which handles negative constraints. Using publicly available data sets we demonstrate that such side information can lead to considerable improvement in clustering tasks, and that our algorithm is preferable to two other suggested methods using the same type of side information.
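The positive-constraint case described above can be sketched in code. The sketch below is not the authors' implementation; it only illustrates the core idea under the assumption stated in the abstract: points linked by positive constraints are grouped into "chunklets" that must arise from a single source, so the E-step computes one responsibility vector per chunklet (multiplying the component likelihoods of its members) rather than per point. Function and variable names here are illustrative, not from the paper.

```python
# A minimal sketch (not the authors' code) of EM for a GMM with positive
# equivalence constraints: each constrained group ("chunklet") is assumed
# to be sampled i.i.d. from a single mixture component.
import numpy as np
from scipy.stats import multivariate_normal


def em_gmm_positive_constraints(X, chunklets, K, n_iter=50, seed=0):
    """X: (n, d) data; chunklets: list of index arrays partitioning X
    (unconstrained points are singleton chunklets); K: number of components."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Initialise parameters from the data.
    mu = X[rng.choice(n, K, replace=False)].astype(float)
    sigma = np.array([np.cov(X.T) + 1e-6 * np.eye(d) for _ in range(K)])
    pi = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # E-step: one responsibility vector per chunklet, since all of its
        # points must share a component; member likelihoods multiply.
        R = np.zeros((len(chunklets), K))
        for j, idx in enumerate(chunklets):
            for k in range(K):
                logp = multivariate_normal.logpdf(X[idx], mu[k], sigma[k])
                R[j, k] = np.log(pi[k]) + np.sum(logp)
            R[j] -= R[j].max()          # stabilise before exponentiating
        R = np.exp(R)
        R /= R.sum(axis=1, keepdims=True)
        # M-step: each point inherits its chunklet's responsibility.
        r_point = np.zeros((n, K))
        for j, idx in enumerate(chunklets):
            r_point[idx] = R[j]
        Nk = r_point.sum(axis=0)
        pi = Nk / n
        for k in range(K):
            mu[k] = (r_point[:, k] @ X) / Nk[k]
            diff = X - mu[k]
            sigma[k] = (r_point[:, k] * diff.T) @ diff / Nk[k] + 1e-6 * np.eye(d)
    return pi, mu, sigma, r_point
```

By construction, all members of a chunklet receive identical responsibilities, which is how the side information steers the estimated components. The negative-constraint case in the paper requires a Generalized EM step over a Markov network and is not captured by this sketch.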