Publication | Open Access
ParseNet: Looking Wider to See Better
Citations: 1.1K · References: 28 · Year: 2015
Convolutional Neural Network · Scene Analysis · Engineering · Machine Learning · Image Analysis · Data Science · Semantic Segmentation · Visual Computing · Video Transformer · Convolutional Networks · Machine Vision · Object Detection · Normalization Parameters · Computer Science · Deep Learning · Computer Vision · Visual Communication · Eye Tracking · Scene Understanding
We present a technique for adding global context to deep convolutional networks for semantic segmentation. The approach is simple: the average feature for a layer is used to augment the features at each location. In addition, we study several idiosyncrasies of training, significantly increasing the performance of baseline networks (e.g., FCN). When we add our proposed global feature, together with a technique for learning normalization parameters, accuracy increases consistently, even over our improved versions of the baselines. Our proposed approach, ParseNet, achieves state-of-the-art performance on SiftFlow and PASCAL-Context at small additional computational cost over the baselines, and near state-of-the-art performance on PASCAL VOC 2012 semantic segmentation with a simple approach. Code is available at https://github.com/weiliu89/caffe/tree/fcn.
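The core idea in the abstract, augmenting each location's features with the layer's global average feature after L2 normalization and learned per-channel scaling, can be sketched as follows. This is a minimal NumPy illustration, not the paper's Caffe implementation; the function and parameter names (`parsenet_context`, `scale_local`, `scale_global`) are hypothetical, and in the actual method the scales are learned during training.

```python
import numpy as np

def l2_normalize(x, axis, eps=1e-12):
    """L2-normalize x along the given axis."""
    norm = np.sqrt(np.sum(x ** 2, axis=axis, keepdims=True))
    return x / (norm + eps)

def parsenet_context(features, scale_local, scale_global):
    """Sketch of ParseNet-style global context augmentation.

    features:     (C, H, W) feature map from a conv layer.
    scale_local:  (C,) per-channel scale for local features
                  (hypothetical name; learned in the paper).
    scale_global: (C,) per-channel scale for the global feature.

    Returns a (2C, H, W) map: L2-normalized, scaled local features
    concatenated with the tiled, normalized global-average feature.
    """
    C, H, W = features.shape
    # Global average pooling over the spatial dimensions -> (C,)
    global_feat = features.mean(axis=(1, 2))
    # Normalize across channels, then apply the learned scales
    local_n = l2_normalize(features, axis=0) * scale_local[:, None, None]
    global_n = l2_normalize(global_feat, axis=0) * scale_global
    # "Unpool" (tile) the global feature back to H x W and concatenate
    global_map = np.broadcast_to(global_n[:, None, None], (C, H, W))
    return np.concatenate([local_n, global_map], axis=0)

feat = np.random.rand(4, 8, 8).astype(np.float32)
out = parsenet_context(feat, np.ones(4), np.ones(4))
print(out.shape)  # (8, 8, 8)
```

The normalization step matters because features from different layers (and the pooled global feature) can differ in scale by orders of magnitude; normalizing both parts to unit L2 norm and letting the network learn the relative scales makes the concatenation well-behaved.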