Concepedia

Publication | Closed Access

Attentive Fashion Grammar Network for Fashion Landmark Detection and Clothing Category Classification

Citations: 260 | References: 47 | Year: 2018

TLDR

This paper proposes a knowledge-guided fashion network that brings high-level human knowledge to visual fashion analysis tasks such as landmark localization and clothing category classification. The network incorporates two fashion grammars, dependency and symmetry, over which Bidirectional Convolutional Recurrent Neural Networks pass messages to regularize landmark layouts. Landmark-aware and category-driven attention mechanisms improve classification. Experimental results on large-scale fashion datasets show that the proposed fashion grammar network outperforms baselines, with the attention mechanisms learning domain-knowledge-centered and goal-driven representations.

Abstract

This paper proposes a knowledge-guided fashion network to solve the problem of visual fashion analysis, e.g., fashion landmark localization and clothing category classification. The suggested fashion model is leveraged with high-level human knowledge in this domain. We propose two important fashion grammars: (i) dependency grammar capturing kinematics-like relation, and (ii) symmetry grammar accounting for the bilateral symmetry of clothes. We introduce Bidirectional Convolutional Recurrent Neural Networks (BCRNNs) for efficiently approaching message passing over grammar topologies, and producing regularized landmark layouts. For enhancing clothing category classification, our fashion network is encoded with two novel attention mechanisms, i.e., landmark-aware attention and category-driven attention. The former enforces our network to focus on the functional parts of clothes, and learns domain-knowledge centered representations, leading to a supervised attention mechanism. The latter is goal-driven, which directly enhances task-related features and can be learned in an implicit, top-down manner. Experimental results on large-scale fashion datasets demonstrate the superior performance of our fashion grammar network.
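
As a rough illustration of the landmark-aware attention idea described in the abstract, the sketch below turns predicted landmark heatmaps into a single spatial attention map and uses it to reweight a feature map, so that responses near landmarks are amplified. The averaging rule, the residual `(1 + attention)` weighting, and the names `landmark_attention` and `apply_attention` are assumptions made for illustration; this is not the paper's implementation.

```python
from typing import List

Grid = List[List[float]]  # one H x W map stored as nested lists


def landmark_attention(heatmaps: List[Grid]) -> Grid:
    """Average K landmark heatmaps into one H x W attention map.

    Averaging is an assumed fusion rule chosen for this sketch.
    """
    k = len(heatmaps)
    h, w = len(heatmaps[0]), len(heatmaps[0][0])
    return [
        [sum(hm[i][j] for hm in heatmaps) / k for j in range(w)]
        for i in range(h)
    ]


def apply_attention(features: List[Grid], attention: Grid) -> List[Grid]:
    """Reweight each feature channel as (1 + attention) * feature.

    The residual form (assumed here) keeps the original features and
    boosts locations that lie close to a predicted landmark.
    """
    return [
        [
            [(1.0 + attention[i][j]) * value for j, value in enumerate(row)]
            for i, row in enumerate(channel)
        ]
        for channel in features
    ]


# Toy input: two landmarks on a 2 x 2 grid and one all-ones feature channel.
heatmaps = [
    [[1.0, 0.0], [0.0, 0.0]],  # hypothetical landmark at the top-left
    [[0.0, 0.0], [0.0, 1.0]],  # hypothetical landmark at the bottom-right
]
features = [[[1.0, 1.0], [1.0, 1.0]]]

attention = landmark_attention(heatmaps)
weighted = apply_attention(features, attention)
```

In this toy case the two landmark cells receive a larger weight than the rest of the grid. In a trained network the heatmaps would come from the landmark branch, which is why the abstract describes this attention as supervised.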
