Information-theoretic metric learning

TLDR

The paper proposes an information‑theoretic method for learning a Mahalanobis distance metric. The method formulates metric learning as a Bregman optimization problem that minimizes the LogDet divergence between two Gaussians under linear constraints, and includes an online variant with proven regret bounds. The algorithm outperforms existing methods by handling diverse constraints, incorporating priors, being fast and scalable without eigenvalue or SDP computations, and achieving strong performance on the Clarify error‑reporting system and standard datasets.

Abstract

In this paper, we present an information-theoretic approach to learning a Mahalanobis distance function. We formulate the problem as that of minimizing the differential relative entropy between two multivariate Gaussians under constraints on the distance function. We express this problem as a particular Bregman optimization problem---that of minimizing the LogDet divergence subject to linear constraints. Our resulting algorithm has several advantages over existing methods. First, our method can handle a wide variety of constraints and can optionally incorporate a prior on the distance function. Second, it is fast and scalable. Unlike most existing methods, no eigenvalue computations or semi-definite programming are required. We also present an online version and derive regret bounds for the resulting algorithm. Finally, we evaluate our method on a recent error reporting system for software called Clarify, in the context of metric learning for nearest neighbor classification, as well as on standard data sets.

References

Page 1

	Year	Citations

Page 1