Publication | Closed Access
Improvements in hidden Markov model based Arabic OCR
Year: 2008
Citations: 41
References: 7
Engineering, Biometrics, Corpus Linguistics, Speech Recognition, Natural Language Processing, Image Analysis, Language Documentation, Machine-printed Arabic Documents, Arabic, Pattern Recognition, Hidden Markov Model, Computational Linguistics, Text Recognition, Language Studies, Character Recognition, Machine Translation, Optical Character Recognition, Computer Science, Arabic OCR, Text Processing, Language Modeling, Linguistics, Document Processing
This paper describes recent advances in hidden Markov model (HMM) based OCR for machine-printed Arabic documents. A combination of script-independent and script-specific techniques is applied to glyph models and language models (LMs). The script-independent techniques we applied are higher-order n-gram LMs for N-best rescoring and discriminative estimation of glyph HMMs. Arabic-specific techniques include the use of context-dependent HMMs for glyph modeling and Parts-of-Arabic-Words in language modeling. We present experimental results that demonstrate a 40% relative reduction in word error rate over the baseline configuration on a corpus of machine-printed Arabic documents.
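As a rough illustration of N-best rescoring with a higher-order n-gram LM, the sketch below re-ranks a list of recognizer hypotheses by adding a weighted trigram LM score to each hypothesis's glyph-model score. It is not the paper's implementation: the trigram order, the add-one smoothing, the `lm_weight` parameter, the function names, and the English placeholder tokens are all assumptions made for the example.

```python
import math
from collections import Counter


def train_trigram_lm(sentences):
    """Count trigrams and their two-token histories over padded sequences."""
    tri, bi, vocab = Counter(), Counter(), set()
    for sent in sentences:
        tokens = ["<s>", "<s>"] + sent + ["</s>"]
        vocab.update(tokens)
        for i in range(2, len(tokens)):
            tri[tuple(tokens[i - 2:i + 1])] += 1
            bi[tuple(tokens[i - 2:i])] += 1
    return tri, bi, len(vocab)


def lm_logprob(sent, tri, bi, vocab_size):
    """Log-probability of a token sequence under an add-one smoothed trigram LM.

    Add-one smoothing is an assumption chosen for brevity.
    """
    tokens = ["<s>", "<s>"] + sent + ["</s>"]
    total = 0.0
    for i in range(2, len(tokens)):
        num = tri[tuple(tokens[i - 2:i + 1])] + 1
        den = bi[tuple(tokens[i - 2:i])] + vocab_size
        total += math.log(num / den)
    return total


def rescore_nbest(nbest, tri, bi, vocab_size, lm_weight=1.0):
    """Return the (tokens, glyph_logscore) pair with the best combined score.

    nbest: list of (tokens, glyph_logscore) pairs from a first-pass decoder.
    lm_weight: hypothetical scaling factor applied to the LM log-probability.
    """
    return max(
        nbest,
        key=lambda h: h[1] + lm_weight * lm_logprob(h[0], tri, bi, vocab_size),
    )


# Toy training text and a two-entry N-best list (placeholder tokens).
tri, bi, vocab_size = train_trigram_lm(
    [["the", "cat", "sat"], ["the", "cat", "ran"]]
)
nbest = [
    (["the", "cat", "sat"], -10.0),  # slightly worse glyph score
    (["the", "cot", "sat"], -9.5),   # better glyph score, unlikely word sequence
]
best_tokens, best_glyph_score = rescore_nbest(nbest, tri, bi, vocab_size)
```

In this toy case the glyph score alone prefers the second hypothesis, while the combined score prefers the first, because the trigram LM penalizes the unseen sequence. In practice the tokens could be words or sub-word units such as Parts-of-Arabic-Words, and the LM weight would be tuned on held-out data.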