Publication | Closed Access
DSAN: Double Supervised Network with Attention Mechanism for Scene Text Recognition
Citations: 16
References: 16
Year: 2019
Unknown Venue
Natural Language Processing, Multimodal LLM, Image Analysis, Machine Learning, Machine Vision, Engineering, Pattern Recognition, Text-to-image Retrieval, Text Recognition, Visual Grounding, Feature Extraction, Vision Language Model, Visual Question Answering, Attention Mechanism, Text Attention Module, Deep Learning, Scene Text Recognition, Computer Vision
In this paper, we propose the Double Supervised Network with Attention Mechanism (DSAN), a novel end-to-end trainable framework for scene text recognition. It incorporates a text attention module during feature extraction, which forces the model to focus on text regions, and the whole framework is supervised by two branches. One supervision signal comes from a context-level modelling branch; the other comes from an extra supervision enhancement branch that aims to tackle inexplicit semantic information at the character level. The two supervision signals benefit each other and yield better performance. The proposed approach can recognize text of arbitrary length and does not need a predefined lexicon. Our method achieves the current state-of-the-art results on three text recognition benchmarks, IIIT5K, ICDAR2013 and SVT, reaching accuracies of 88.6%, 92.3% and 84.1% respectively, which suggests the effectiveness of the proposed method.
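The two-branch supervision described in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss functions, the form of the attention module, or how the branches are weighted, so everything below is an assumption made for illustration: the sigmoid gating in `apply_text_attention`, the per-character cross-entropy used for both branches, and the hypothetical `aux_weight` parameter. It is not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def apply_text_attention(features, attention_logits):
    """Scale each feature vector by a sigmoid attention score.

    Assumption: the text attention module acts as a soft gate that
    emphasises text regions; the paper's actual module may differ.
    """
    weights = [1.0 / (1.0 + math.exp(-a)) for a in attention_logits]
    return [[w * x for x in vec] for w, vec in zip(weights, features)]


def cross_entropy(logits_per_step, target_ids):
    """Mean negative log-likelihood over character positions."""
    losses = []
    for logits, target in zip(logits_per_step, target_ids):
        probs = softmax(logits)
        losses.append(-math.log(probs[target]))
    return sum(losses) / len(losses)


def double_supervised_loss(context_logits, enhancement_logits,
                           target_ids, aux_weight=1.0):
    """Sum the losses of the two supervision branches.

    Assumption: both branches predict one character per position and
    are combined as a weighted sum; aux_weight is hypothetical.
    """
    context_loss = cross_entropy(context_logits, target_ids)
    enhancement_loss = cross_entropy(enhancement_logits, target_ids)
    return context_loss + aux_weight * enhancement_loss
```

In this sketch both branches receive the same character targets, so gradients from the enhancement branch would flow into the shared feature extractor alongside those from the context branch, which is one plausible reading of how the two supervision signals "benefit each other".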