Distance Metric Learning for Large Margin Nearest Neighbor Classification

TLDR

The learning problem reduces to a convex optimization based on the hinge loss, analogous to support vector machines. The study demonstrates learning a Mahalanobis distance metric for k‑nearest neighbor classification via semidefinite programming. The metric is trained by semidefinite programming to ensure that k‑nearest neighbors belong to the same class while different‑class examples are separated by a large margin, and the framework handles multiway classification without modification. On seven data sets of varying size and difficulty, the learned metrics significantly improve k‑nearest neighbor classification, achieving a 1.3 % test error on MNIST.

Abstract

We show how to learn a Mahanalobis distance metric for k-nearest neighbor (kNN) classification by semidefinite programming. The metric is trained with the goal that the k-nearest neighbors always belong to the same class while examples from different classes are separated by a large margin. On seven data sets of varying size and difficulty, we find that metrics trained in this way lead to significant improvements in kNN classification—for example, achieving a test error rate of 1.3% on the MNIST handwritten digits. As in support vector machines (SVMs), the learning problem reduces to a convex optimization based on the hinge loss. Unlike learning in SVMs, however, our framework requires no modification or extension for problems in multiway (as opposed to binary) classification.

References

Page 1

	Year	Citations

Page 1