SVM-KNN: Discriminative Nearest Neighbor Classification for Visual Category Recognition

TLDR

Visual category recognition can be framed as measuring perceptual distances to prototype examples, which allows color, texture, and shape to be handled in one framework. In that setting, nearest-neighbor classifiers suffer from high variance when training data are limited, while support vector machines require costly optimization and pairwise distance computation. The authors propose SVM-KNN, a hybrid that finds the nearest neighbors of a query and trains a local SVM that preserves the distance function among those neighbors. The method handles the multiclass setting naturally and is reported to outperform both nearest-neighbor and SVM baselines, with state-of-the-art accuracy on MNIST, USPS, CUReT, and Caltech-101. On Caltech-101 it reaches 59.05% (±0.56%) with 15 training images per class and 66.23% (±0.48%) with 30.

Abstract

We consider visual category recognition in the framework of measuring similarities, or equivalently perceptual distances, to prototype examples of categories. This approach is quite flexible, and permits recognition based on color, texture, and particularly shape, in a homogeneous framework. While nearest neighbor classifiers are natural in this setting, they suffer from the problem of high variance (in bias-variance decomposition) in the case of limited sampling. Alternatively, one could use support vector machines but they involve time-consuming optimization and computation of pairwise distances. We propose a hybrid of these two methods which deals naturally with the multiclass setting, has reasonable computational complexity both in training and at run time, and yields excellent results in practice. The basic idea is to find close neighbors to a query sample and train a local support vector machine that preserves the distance function on the collection of neighbors. Our method can be applied to large, multiclass data sets for which it outperforms nearest neighbor and support vector machines, and remains efficient when the problem becomes intractable for support vector machines. A wide variety of distance functions can be used and our experiments show state-of-the-art performance on a number of benchmark data sets for shape and texture classification (MNIST, USPS, CUReT) and object recognition (Caltech-101). On Caltech-101 we achieved a correct classification rate of 59.05% (±0.56%) at 15 training images per class, and 66.23% (±0.48%) at 30 training images.
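
The basic idea in the abstract (find the close neighbors of a query, then train a local SVM on them using a kernel derived from the distance function) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the function name `svm_knn_predict`, the Euclidean default distance, the kernel formula (squared distances converted to inner products with the query taken as origin), the `+ 1` added to the kernel to stand in for a bias term, the one-vs-rest multiclass scheme, and the simple dual coordinate-ascent trainer swapped in for a full SVM solver.

```python
import math


def svm_knn_predict(query, X, y, k=5, dist=math.dist, C=1.0, epochs=200):
    """Classify `query` with a local SVM trained on its k nearest neighbors.

    Illustrative sketch only: kernel choice, solver, and one-vs-rest
    scheme are assumptions, not taken from the abstract.
    """
    # Step 1: find the k nearest training samples under `dist`.
    d_q = [dist(query, x) for x in X]
    idx = sorted(range(len(X)), key=lambda i: d_q[i])[:k]
    labels = [y[i] for i in idx]

    # Shortcut: a unanimous neighborhood needs no SVM.
    classes = sorted(set(labels))
    if len(classes) == 1:
        return classes[0]

    # Step 2: turn pairwise distances among the neighbors into a kernel,
    # with the query as origin (assumed formula). The +1 folds a bias
    # feature into the kernel so a bias-free solver can be used.
    n = len(idx)
    K = [[0.5 * (d_q[idx[i]] ** 2 + d_q[idx[j]] ** 2
                 - dist(X[idx[i]], X[idx[j]]) ** 2) + 1.0
          for j in range(n)] for i in range(n)]

    # Step 3: one-vs-rest local SVMs trained by dual coordinate ascent.
    # The query sits at the origin, so its kernel value against every
    # neighbor is exactly the +1 bias feature.
    def score(cls):
        t = [1.0 if lab == cls else -1.0 for lab in labels]
        alpha = [0.0] * n
        for _ in range(epochs):
            for i in range(n):
                f_i = sum(alpha[j] * t[j] * K[i][j] for j in range(n))
                step = (1.0 - t[i] * f_i) / K[i][i]
                alpha[i] = min(C, max(0.0, alpha[i] + step))
        # Decision value at the query: sum_i alpha_i * t_i * 1.
        return sum(a * s for a, s in zip(alpha, t))

    return max(classes, key=score)


# Toy data: two small 2-D clusters with hypothetical labels "a" and "b".
X = [(0, 0), (0, 1), (1, 0), (-1, -1), (5, 5), (5, 6), (6, 5), (6, 6)]
y = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(svm_knn_predict((1, 1), X, y, k=6))
```

Because only the k neighbors enter the kernel matrix, the cost per query depends on k rather than on the full training set size, which is the efficiency argument the abstract makes. Any other distance can be passed through the `dist` parameter, though a non-Euclidean distance may yield a kernel that is not positive semidefinite, a case this sketch does not handle specially.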
