Publication | Closed Access
Distance Encoded Product Quantization for Approximate K-Nearest Neighbor Search in High-Dimensional Space
Citations: 29
References: 34
Year: 2018
Cluster Computing, Engineering, Machine Learning, Range Searching, Unsupervised Machine Learning, Image Analysis, Information Retrieval, Data Science, Data Mining, Pattern Recognition, Cluster Index, Knowledge Discovery, Computer Science, Dimensionality Reduction, Deep Learning, Image Similarity, Quantization (Signal Processing), Computer Vision, Product Quantization, Similarity Search, High-dimensional Space
Approximate K-nearest neighbor search is a fundamental problem in computer science. The problem is especially important for high-dimensional and large-scale data. Recently, many techniques that encode high-dimensional data into compact codes have been proposed. Product quantization and its variants, which encode a cluster index in each subspace, have been shown to provide impressive accuracy. In this paper, we explore a simple question: is it best to use the entire bit budget for encoding a cluster index? We have found that the farther a data point lies from its cluster center, the larger the error of the estimated distance becomes. To address this issue, we propose a novel compact code representation that encodes both the cluster index and the quantized distance between a point and its cluster center in each subspace, dividing the bit budget between the two. We also propose two distance estimators tailored to our representation. We further extend our method to encode global residual distances in the original space. We have evaluated our proposed methods on benchmarks consisting of GIST, VLAD, and CNN features. Our extensive experiments show that the proposed methods significantly and consistently improve search accuracy over the other tested techniques. This result is achieved mainly because our methods estimate distances more accurately.
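The idea of storing a quantized center distance alongside the cluster index can be illustrated with a small sketch. It is not the paper's implementation: the function names, the toy codebooks, and the `dist_levels` tables are all hypothetical, and the corrected estimator assumes that a point's residual is roughly orthogonal to the query's offset from the same center. That is one simple way to use the stored distance, and not necessarily either of the two estimators the abstract mentions.

```python
import math


def split(vec, num_subspaces):
    """Split a vector into equal-length contiguous sub-vectors."""
    step = len(vec) // num_subspaces
    return [vec[i * step:(i + 1) * step] for i in range(num_subspaces)]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(value, candidates, cost):
    """Index of the candidate with the smallest cost relative to value."""
    return min(range(len(candidates)), key=lambda j: cost(value, candidates[j]))


def encode(vec, codebooks, dist_levels):
    """Return one (cluster_index, distance_index) pair per subspace.

    codebooks[s]   : centers of subspace s (assumed to be trained elsewhere)
    dist_levels[s] : representative center distances for subspace s
    """
    code = []
    for sub, centers, levels in zip(split(vec, len(codebooks)), codebooks, dist_levels):
        k = nearest(sub, centers, sq_dist)           # cluster index
        r = math.sqrt(sq_dist(sub, centers[k]))      # distance to that center
        d = nearest(r, levels, lambda a, b: abs(a - b))  # quantized distance
        code.append((k, d))
    return code


def estimate_cluster_only(query, code, codebooks):
    """Asymmetric estimate that uses only the cluster centers."""
    subs = split(query, len(codebooks))
    return sum(sq_dist(q, centers[k]) for q, centers, (k, _) in zip(subs, codebooks, code))


def estimate_with_distance(query, code, codebooks, dist_levels):
    """Adds the squared quantized residual length (orthogonality assumption)."""
    subs = split(query, len(codebooks))
    return sum(
        sq_dist(q, centers[k]) + levels[d] ** 2
        for q, centers, levels, (k, d) in zip(subs, codebooks, dist_levels, code)
    )


# Toy setup: 4-D vectors, 2 subspaces, 1 bit for the index and 1 bit for the distance.
codebooks = [[[0, 0], [4, 4]], [[0, 0], [4, 4]]]
dist_levels = [[0.0, 1.0], [0.0, 1.0]]
x = [1, 0, 4, 5]
q = [0, 1, 5, 4]
code = encode(x, codebooks, dist_levels)
print(code)
print(sq_dist(q, x))
print(estimate_cluster_only(q, code, codebooks))
print(estimate_with_distance(q, code, codebooks, dist_levels))
```

In this toy case the true squared distance is 4. The cluster-only estimate is 2, while the estimate that uses the stored distances recovers 4, because the residuals here happen to be exactly orthogonal to the query offsets. On real data the correction is only approximate.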