Robust Object Recognition with Cortex-Like Mechanisms

TLDR

The study introduces a biologically motivated hierarchical system that alternates template matching and maximum pooling to build increasingly complex and invariant feature representations for recognizing complex visual scenes. The authors demonstrate the system on a range of tasks, from invariant single-object recognition in clutter to multiclass categorization and scene understanding that requires recognizing both shape-based and texture-based objects. The approach learns from only a few training examples and competes with state-of-the-art systems. The authors also discuss the existence of a universal, redundant dictionary of features that could handle the recognition of most object categories, and they argue that the system's success offers a plausibility proof for a class of feedforward models of object recognition in cortex.

Abstract

We introduce a new general framework for the recognition of complex visual scenes, which is motivated by biology: we describe a hierarchical system that closely follows the organization of visual cortex and builds an increasingly complex and invariant feature representation by alternating between a template matching and a maximum pooling operation. We demonstrate the strength of the approach on a range of recognition tasks, from invariant single object recognition in clutter to multiclass categorization problems and complex scene understanding tasks that rely on the recognition of both shape-based and texture-based objects. Given the biological constraints that the system had to satisfy, the approach performs surprisingly well: it has the capability of learning from only a few training examples and competes with state-of-the-art systems. We also discuss the existence of a universal, redundant dictionary of features that could handle the recognition of most object categories. In addition to its relevance for computer vision, the success of this approach suggests a plausibility proof for a class of feedforward models of object recognition in cortex.
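The alternation between template matching and maximum pooling described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the Gaussian tuning function, the `sigma` value, the non-overlapping pooling grid, and the function names `template_match` and `max_pool` are all assumptions chosen for illustration. It shows only a single matching-then-pooling pair, whereas the abstract describes a hierarchy that repeats the two operations.

```python
import numpy as np


def template_match(image, templates, sigma=1.0):
    """Template-matching stage (illustrative, not the paper's exact unit).

    image:     (H, W) array
    templates: (K, h, w) array of stored patches
    returns:   (K, H-h+1, W-w+1) responses in (0, 1], equal to 1 on an exact match
    """
    _, h, w = templates.shape
    # Every h x w patch of the image, shape (H-h+1, W-w+1, h, w)
    patches = np.lib.stride_tricks.sliding_window_view(image, (h, w))
    # Squared Euclidean distance between each patch and each template
    diff = patches[None, ...] - templates[:, None, None, :, :]
    dist2 = (diff ** 2).sum(axis=(-2, -1))
    # Assumed Gaussian tuning around each template
    return np.exp(-dist2 / (2.0 * sigma ** 2))


def max_pool(responses, size=2):
    """Pooling stage: maximum over non-overlapping size x size neighborhoods."""
    k, height, width = responses.shape
    h2, w2 = height // size, width // size
    # Trim so the map divides evenly, then split into blocks and take the max
    trimmed = responses[:, :h2 * size, :w2 * size]
    blocks = trimmed.reshape(k, h2, size, w2, size)
    return blocks.max(axis=(2, 4))


# Hypothetical demo data: a random image and two templates cut from it
rng = np.random.default_rng(0)
image = rng.random((8, 8))
templates = np.stack([image[0:3, 0:3], image[4:7, 2:5]])

s = template_match(image, templates)   # shape (2, 6, 6)
c = max_pool(s, size=2)                # shape (2, 3, 3)

# Shift the image one pixel to the right (with wrap-around) and repeat
shifted = np.roll(image, 1, axis=1)
c_shifted = max_pool(template_match(shifted, templates), size=2)
```

In this demo each template matches its source location exactly, so the corresponding pooled response is 1.0. After the one-pixel shift the exact match moves to a neighboring position inside the same pooling neighborhood, so the pooled response stays at 1.0. That is the sense in which the pooling step buys tolerance to small changes in position while the matching step provides selectivity.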
