GCNet: Non-Local Networks Meet Squeeze-Excitation Networks and Beyond

TLDR

NLNet captures long-range dependencies by aggregating a query-specific global context at each query position, but an empirical analysis shows that these contexts are almost the same for different query positions within an image. The authors exploit this finding to build a simplified, query-independent network that preserves NLNet's accuracy with significantly less computation, and they observe that the simplified design shares a similar structure with SENet. They unify the two into a three-step framework for global context modeling and instantiate it with a lightweight global context (GC) block. Because the block is lightweight, it can be applied at multiple layers of a backbone; the resulting GCNet generally outperforms both simplified NLNet and SENet on major benchmarks for various recognition tasks.

Abstract

The Non-Local Network (NLNet) presents a pioneering approach for capturing long-range dependencies, via aggregating query-specific global context to each query position. However, through a rigorous empirical analysis, we have found that the global contexts modeled by the non-local network are almost the same for different query positions within an image. In this paper, we take advantage of this finding to create a simplified network based on a query-independent formulation, which maintains the accuracy of NLNet but with significantly less computation. We further observe that this simplified design shares a similar structure with the Squeeze-Excitation Network (SENet). Hence we unify them into a three-step general framework for global context modeling. Within the general framework, we design a better instantiation, called the global context (GC) block, which is lightweight and can effectively model the global context. The lightweight property allows us to apply it to multiple layers in a backbone network to construct a global context network (GCNet), which generally outperforms both simplified NLNet and SENet on major benchmarks for various recognition tasks.
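
The abstract does not spell out the three steps, so the following is only a minimal NumPy sketch of what a query-independent global context block might look like. It assumes the steps are (1) context modeling by a single softmax attention pooling over all positions, (2) a bottleneck transform with layer normalization and ReLU, and (3) fusion by broadcast addition. The names `gc_block`, `w_k`, `w_v1`, and `w_v2` are illustrative, the weights are random rather than learned, and the layer normalization omits learnable scale and shift.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def gc_block(x, w_k, w_v1, w_v2, eps=1e-5):
    """Sketch of a query-independent global context block (assumed design).

    x    : (C, H, W) feature map
    w_k  : (C,) weights producing one attention logit per position
    w_v1 : (C_r, C) bottleneck reduction weights
    w_v2 : (C, C_r) bottleneck expansion weights
    Returns the fused feature map and the (H, W) attention map.
    """
    c, h, w = x.shape
    flat = x.reshape(c, h * w)            # (C, N) with N = H * W

    # Step 1, context modeling: a single attention map shared by all
    # query positions pools the features into one global context vector.
    attn = softmax(w_k @ flat)            # (N,)
    context = flat @ attn                 # (C,)

    # Step 2, transform: reduce, normalize, apply ReLU, then expand.
    hidden = w_v1 @ context               # (C_r,)
    hidden = (hidden - hidden.mean()) / np.sqrt(hidden.var() + eps)
    hidden = np.maximum(hidden, 0.0)
    delta = w_v2 @ hidden                 # (C,)

    # Step 3, fusion: add the same context vector to every position.
    out = x + delta[:, None, None]
    return out, attn.reshape(h, w)
```

Because the attention map is computed once rather than once per query position, the pooled context costs on the order of `C * N` operations instead of the `C * N * N` a query-specific attention matrix would need, which is the kind of saving the abstract attributes to the query-independent formulation.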
