Publication | Closed Access
Multi-Label Image Recognition With Graph Convolutional Networks
1.2K
Citations
33
References
2019
Year
Unknown Venue
Convolutional Neural NetworkMultiple Instance LearningEngineeringMachine LearningGraph Signal ProcessingGraph ProcessingImage ClassificationImage AnalysisData ScienceLabel GraphPattern RecognitionSuch Important DependenciesLabel DependenciesMachine VisionGraph Convolutional NetworksFeature LearningComputer ScienceDeep LearningComputer VisionGraph TheoryGraph Neural Network
Multi‑label image recognition predicts all object labels present in an image, and modeling label co‑occurrence improves performance. The study proposes a GCN‑based multi‑label classification model to capture label dependencies. The model builds a directed graph of label embeddings, trains a GCN to generate inter‑dependent classifiers applied to image features, and employs a re‑weighted correlation matrix to guide propagation, allowing end‑to‑end training. Experiments on two datasets show the approach outperforms state‑of‑the‑art methods, and visualizations confirm the learned classifiers preserve meaningful semantic topology.
The task of multi-label image recognition is to predict a set of object labels that present in an image. As objects normally co-occur in an image, it is desirable to model the label dependencies to improve the recognition performance. To capture and explore such important dependencies, we propose a multi-label classification model based on Graph Convolutional Network (GCN). The model builds a directed graph over the object labels, where each node (label) is represented by word embeddings of a label, and GCN is learned to map this label graph into a set of inter-dependent object classifiers. These classifiers are applied to the image descriptors extracted by another sub-net, enabling the whole network to be end-to-end trainable. Furthermore, we propose a novel re-weighted scheme to create an effective label correlation matrix to guide information propagation among the nodes in GCN. Experiments on two multi-label image recognition datasets show that our approach obviously outperforms other existing state-of-the-art methods. In addition, visualization analyses reveal that the classifiers learned by our model maintain meaningful semantic topology.
| Year | Citations | |
|---|---|---|
Page 1
Page 1