Publication | Closed Access
High-confidence near-duplicate image detection
Year: 2012
Citations: 46
References: 23
Unknown Venue
Engineering, Machine Learning, Image Retrieval, Biometrics, Image Search, Image Forensics, Image Analysis, Information Retrieval, Data Science, Data Mining, Pattern Recognition, Near-duplicate Image Detection, High Confidence, Ambiguous SIFT Features, Machine Vision, Knowledge Discovery, Computer Science, Image Similarity, Computer Vision, Spatial Verification, Similarity Search, Content-based Image Retrieval
In this paper, we propose two techniques for near-duplicate image detection at high confidence and large scale. First, we show that entropy-based filtering eliminates the ambiguous SIFT features that cause most false positives, and makes it possible to claim near-duplicity from a single match between the retained high-quality features. Second, we show that graph cut can be used for query expansion, with a duplicity graph computed offline, to substantially improve search quality. Evaluation with web images shows that, when combined with sketch embedding [6], our methods achieve a false positive rate orders of magnitude lower than that of the standard visual word approach. We demonstrate the proposed techniques with a large-scale image search engine which, using index data structures computed offline on a Hadoop cluster, is capable of serving more than 50 million web images from a single commodity server.
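The entropy-based filtering step can be illustrated with a minimal sketch. The abstract does not define the entropy measure or the threshold, so the code below assumes one plausible reading: each descriptor is treated as a bag of quantized integer values, the Shannon entropy of their empirical distribution is computed, and descriptors below a cutoff are discarded as ambiguous. The function names `descriptor_entropy` and `filter_by_entropy` and the 4.0-bit cutoff are hypothetical and chosen for illustration only.

```python
import math
from collections import Counter


def descriptor_entropy(descriptor):
    """Shannon entropy, in bits, of the empirical distribution of the
    values in one descriptor (assumed to be quantized integers)."""
    n = len(descriptor)
    counts = Counter(descriptor)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def filter_by_entropy(descriptors, min_entropy):
    """Keep only descriptors whose entropy is at least min_entropy.

    Assumption: low-entropy descriptors (few distinct values) are the
    ambiguous ones. The cutoff is a tunable, hypothetical parameter.
    """
    return [d for d in descriptors if descriptor_entropy(d) >= min_entropy]


# Toy 128-dimensional descriptors, not real SIFT output.
flat = [0] * 128            # a single repeated value
varied = list(range(128))   # 128 distinct values
kept = filter_by_entropy([flat, varied], min_entropy=4.0)
```

In this toy input only `varied` survives the filter. In practice the cutoff would be tuned on labeled data to trade recall against the false positive rate.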