Concepedia

TLDR

Recognizing unconstrained handwritten text is difficult because of segmentation challenges and the need for contextual modeling, and recent progress has come mainly from preprocessing or language modeling while the core recognition algorithms remain based on outdated hidden Markov models. This study proposes a novel recurrent neural network designed for sequence labeling on hard-to-segment data with long-range bidirectional interdependencies. Experiments on two large handwriting databases show the network achieves 79.7% online and 74.1% offline word accuracy, surpassing a state-of-the-art HMM system; it also remains robust to lexicon size, and the study measures the influence of its hidden layers and its use of context.

Abstract

Recognizing lines of unconstrained handwritten text is a challenging task. The difficulty of segmenting cursive or overlapping characters, combined with the need to exploit surrounding context, has led to low recognition rates for even the best current recognizers. Most recent progress in the field has been made either through improved preprocessing or through advances in language modeling. Relatively little work has been done on the basic recognition algorithms. Indeed, most systems rely on the same hidden Markov models that have been used for decades in speech and handwriting recognition, despite their well-known shortcomings. This paper proposes an alternative approach based on a novel type of recurrent neural network, specifically designed for sequence labeling tasks where the data is hard to segment and contains long-range bidirectional interdependencies. In experiments on two large unconstrained handwriting databases, our approach achieves word recognition accuracies of 79.7 percent on online data and 74.1 percent on offline data, significantly outperforming a state-of-the-art HMM-based system. In addition, we demonstrate the network's robustness to lexicon size, measure the individual influence of its hidden layers, and analyze its use of context. Last, we provide an in-depth discussion of the differences between the network and HMMs, suggesting reasons for the network's superior performance.
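The abstract's central idea is a recurrent network that labels each position of an unsegmented sequence while drawing on context from both directions. As a minimal sketch of that idea (not the paper's actual architecture, which is more elaborate; the weight shapes, toy inputs, and a plain tanh recurrence here are illustrative assumptions), a bidirectional RNN runs one pass left-to-right and one right-to-left, then combines both hidden states at every timestep before a softmax over label classes:

```python
import math
import random

random.seed(0)

def init(rows, cols):
    # small random weight matrix (illustrative initialization)
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def rnn(xs, Wx, Wh, reverse=False):
    """One tanh RNN direction; returns a hidden vector per timestep."""
    seq = xs[::-1] if reverse else xs
    h = [0.0] * len(Wh)
    hs = []
    for x in seq:
        pre = [a + b for a, b in zip(matvec(Wx, x), matvec(Wh, h))]
        h = [math.tanh(p) for p in pre]
        hs.append(h)
    return hs[::-1] if reverse else hs  # re-align reversed pass with time order

def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

def label_sequence(xs, params):
    """Bidirectional labeling: each timestep sees full left and right context."""
    Wxf, Whf, Wxb, Whb, Wy = params
    fwd = rnn(xs, Wxf, Whf)
    bwd = rnn(xs, Wxb, Whb, reverse=True)
    # concatenate both directions, then one softmax over label classes per step
    return [softmax(matvec(Wy, f + b)) for f, b in zip(fwd, bwd)]

# toy setup: 3-dim input features, 4 hidden units per direction, 5 label classes
H, D, C = 4, 3, 5
params = (init(H, D), init(H, H), init(H, D), init(H, H), init(C, 2 * H))
xs = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.1], [0.2, 0.2, 0.0]]
probs = label_sequence(xs, params)  # one class distribution per timestep
```

Because the backward pass starts from the end of the sequence, even the first timestep's label distribution can depend on input far to its right, which is the "long-range bidirectional interdependency" the abstract refers to; the paper's full system additionally handles unsegmented target labelings rather than one fixed label per timestep.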

