Concepedia

Publication | Closed Access

An analysis of single-layer networks in unsupervised feature learning

Citations: 2.5K | References: 28 | Year: 2011

TLDR

Unsupervised feature learning has advanced on benchmarks such as NORB and CIFAR through increasingly complex algorithms and deep models. This study demonstrates that simple design choices—particularly the number of hidden nodes—can outweigh algorithmic sophistication or model depth in achieving high performance. The authors evaluate off-the-shelf methods (sparse auto-encoders, sparse RBMs, K-means, Gaussian mixtures) on CIFAR, NORB, and STL with single-layer networks, systematically varying receptive field size, hidden node count, stride, and whitening. By maximizing hidden nodes and dense feature extraction, the authors attain state-of-the-art accuracy (79.6% on CIFAR-10, 97.2% on NORB) with a single layer, with K-means delivering the best performance among the tested algorithms.

Abstract

A great deal of research has focused on algorithms for learning features from unlabeled data. Indeed, much progress has been made on benchmark datasets like NORB and CIFAR by employing increasingly complex unsupervised learning algorithms and deep models. In this paper, however, we show that several simple factors, such as the number of hidden nodes in the model, may be more important to achieving high performance than the learning algorithm or the depth of the model. Specifically, we will apply several off-the-shelf feature learning algorithms (sparse auto-encoders, sparse RBMs, K-means clustering, and Gaussian mixtures) to CIFAR, NORB, and STL datasets using only single-layer networks. We then present a detailed analysis of the effect of changes in the model setup: the receptive field size, number of hidden nodes (features), the step-size ("stride") between extracted features, and the effect of whitening. Our results show that large numbers of hidden nodes and dense feature extraction are critical to achieving high performance—so critical, in fact, that when these parameters are pushed to their limits, we achieve state-of-the-art performance on both CIFAR-10 and NORB using only a single layer of features. More surprisingly, our best performance is based on K-means clustering, which is extremely fast, has no hyperparameters to tune beyond the model structure itself, and is very easy to implement. Despite the simplicity of our system, we achieve accuracy beyond all previously published results on the CIFAR-10 and NORB datasets (79.6% and 97.2% respectively).
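The pipeline the abstract describes (whiten small image patches, learn a K-means dictionary whose centroids act as the "hidden nodes", then encode each patch against the centroids) can be sketched in plain NumPy. This is a minimal illustration, not the authors' implementation: the random patches stand in for CIFAR data, and the dictionary size `k = 50` and whitening regularizer `eps = 0.1` are hypothetical toy values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for image patches: n patches of a d-dimensional receptive
# field (the paper extracts e.g. 6x6x3 = 108-dim patches from CIFAR images).
patches = rng.normal(size=(1000, 108))

# --- ZCA whitening (one of the setup factors analyzed in the paper) ---
mean = patches.mean(axis=0)
X = patches - mean
cov = X.T @ X / X.shape[0]
eigvals, eigvecs = np.linalg.eigh(cov)
eps = 0.1  # hypothetical regularizer to avoid amplifying tiny eigenvalues
W = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
Xw = X @ W

# --- K-means dictionary: k centroids play the role of hidden nodes ---
k = 50
centroids = Xw[rng.choice(len(Xw), size=k, replace=False)]
for _ in range(10):  # a few Lloyd iterations
    dists = ((Xw[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    labels = dists.argmin(axis=1)
    for j in range(k):
        if (labels == j).any():
            centroids[j] = Xw[labels == j].mean(axis=0)

# --- Soft ("triangle") encoding: each patch becomes a k-dim feature vector,
# active only for centroids closer than the mean centroid distance ---
d = np.sqrt(((Xw[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1))
features = np.maximum(0.0, d.mean(axis=1, keepdims=True) - d)

print(features.shape)  # one k-dim feature vector per patch
```

In the full system these per-patch features would be extracted densely over each image (the "stride" factor) and pooled before classification; the sketch stops at the encoding step.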

