Publication | Closed Access
Practical Block-Wise Neural Network Architecture Generation
Citations: 532
References: 36
Year: 2018
Unknown Venue
Block-wise Generation, Convolutional Neural Network, Engineering, Machine Learning, Computer Architecture, Image Analysis, Data Science, Sparse Neural Network, Computer Design, Embedded Machine Learning, Robot Learning, Parallel Computing, Video Transformer, Machine Vision, Computer Engineering, Optimal Network Block, Computer Science, Deep Learning, Neural Architecture Search, Computer Vision, Circuit Design, Convolutional Neural Networks, Parallel Programming
Convolutional neural networks have achieved remarkable success in computer vision, yet most practical architectures are hand-crafted and demand expert design. This work introduces BlockQNN, a block-wise network generation pipeline that automatically constructs high-performance models using Q-learning with epsilon-greedy exploration. A learning agent is trained to sequentially select the component layers of an optimal block, the block is stacked to form the full network, and a distributed asynchronous framework with an early-stop strategy accelerates the search. The generated networks are competitive with hand-crafted state-of-the-art models on image classification: the best one achieves a 3.54% top-1 error rate on CIFAR-10, the search takes only 3 days on 32 GPUs, and a network built on CIFAR also performs well on the larger-scale ImageNet dataset.
Convolutional neural networks have achieved remarkable success in computer vision. However, most usable network architectures are hand-crafted and usually require expertise and elaborate design. In this paper, we provide a block-wise network generation pipeline called BlockQNN, which automatically builds high-performance networks using the Q-learning paradigm with an epsilon-greedy exploration strategy. The optimal network block is constructed by a learning agent that is trained to choose component layers sequentially. We stack the block to construct the whole auto-generated network. To accelerate the generation process, we also propose a distributed asynchronous framework and an early-stop strategy. Block-wise generation brings unique advantages: (1) it yields results competitive with hand-crafted state-of-the-art networks on image classification, and the best network generated by BlockQNN achieves a 3.54% top-1 error rate on CIFAR-10, which beats all existing auto-generated networks; (2) it greatly reduces the search space in designing networks, so the search takes only 3 days with 32 GPUs; and (3) it generalizes well, in that the network built on CIFAR also performs well on the larger-scale ImageNet dataset.
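The search loop described in the abstract can be illustrated with a minimal tabular Q-learning sketch. It is not the authors' implementation. The layer vocabulary `LAYER_TYPES`, the fixed `BLOCK_LENGTH`, the linear epsilon schedule, and the function names are all hypothetical, and `toy_reward` is a stand-in for the validation accuracy that a real run would obtain by training the stacked network. The distributed asynchronous framework and the early-stop strategy are not modeled.

```python
import random

# Hypothetical layer vocabulary; the real search space is not given in this text.
LAYER_TYPES = ["conv3x3", "conv5x5", "maxpool", "identity"]
BLOCK_LENGTH = 3  # assumed fixed number of layers per block


def toy_reward(block):
    """Stand-in for the accuracy of the stacked network (an assumption)."""
    score = 0.5
    score += 0.1 * block.count("conv3x3")
    score += 0.05 * block.count("conv5x5")
    score -= 0.1 * block.count("identity")
    return score


def choose_action(q_table, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.choice(LAYER_TYPES)
    return max(LAYER_TYPES, key=lambda a: q_table.get((state, a), 0.0))


def search_block(episodes=2000, alpha=0.1, gamma=1.0, seed=0):
    """Learn Q-values over partial blocks, then return the greedy block."""
    rng = random.Random(seed)
    q_table = {}  # maps (state, action) -> estimated value
    for episode in range(episodes):
        # Assumed schedule: decay epsilon linearly from 1.0 down to 0.1.
        epsilon = max(0.1, 1.0 - episode / (0.8 * episodes))
        state = ()  # the state is the tuple of layers chosen so far
        transitions = []
        for _ in range(BLOCK_LENGTH):
            action = choose_action(q_table, state, epsilon, rng)
            next_state = state + (action,)
            transitions.append((state, action, next_state))
            state = next_state
        reward = toy_reward(state)  # reward arrives only for the finished block
        for s, a, s_next in transitions:
            if len(s_next) == BLOCK_LENGTH:
                target = reward
            else:
                target = gamma * max(
                    q_table.get((s_next, b), 0.0) for b in LAYER_TYPES
                )
            old = q_table.get((s, a), 0.0)
            q_table[(s, a)] = old + alpha * (target - old)
    # Greedy rollout with the learned Q-values.
    state = ()
    for _ in range(BLOCK_LENGTH):
        best = max(LAYER_TYPES, key=lambda a: q_table.get((state, a), 0.0))
        state = state + (best,)
    return list(state), q_table


def stack_blocks(block, num_blocks):
    """Repeat the block to form the layer list of the whole network."""
    return [layer for _ in range(num_blocks) for layer in block]


if __name__ == "__main__":
    best_block, _ = search_block()
    print(best_block)
    print(stack_blocks(best_block, num_blocks=2))
```

The agent searches over one small block instead of a full network, so the table of state-action values stays small. This is the search-space reduction the abstract refers to, shown here only at toy scale.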