Publication | Closed Access

Handwritten Optical Character Recognition system for Sindhi numerals

Citations: 17

References: 8

Year: 2016

Abstract

Sindhi is a script language like Arabic and Persian. Its origins date back about 2,500 years, and it is spoken in various countries in Asia. In this paper, we propose an Optical Character Recognition (OCR) system that recognizes handwritten Sindhi numeral expressions (i.e., handwritten Sindhi numeral strings) without using common input devices such as a keyboard or storage device memory. Our experiments focus on character recognition, which can later be used for applications such as tutoring, mathematical games for children, and automatic conversion of telephone numbers from signboards in India and Pakistan. We investigate the correlation between the numeral shapes and apply a well-known, state-of-the-art classifier based on correlation-based template matching. We show experimentally that template matching performs poorly because the shapes of the numerals are highly correlated. A small body of literature addresses OCR for the Sindhi language, but the lack of a benchmark dataset makes it difficult for researchers around the world to re-implement the published frameworks. We provide two sets of images that can be used for training and prediction.
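The abstract does not give implementation details, so the following is only a minimal sketch of how correlation-based template matching is commonly done, not the authors' classifier. It assumes equally sized grayscale images stored as nested lists and uses the normalized correlation coefficient as the similarity score. The names `ncc`, `classify`, and `TEMPLATES`, as well as the toy 3x3 shapes, are hypothetical and are not taken from the paper or its dataset.

```python
import math


def ncc(a, b):
    """Normalized correlation coefficient of two equally sized grayscale images."""
    xs = [p for row in a for p in row]
    ys = [p for row in b for p in row]
    if len(xs) != len(ys):
        raise ValueError("images must have the same size")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    # Covariance-like numerator and the product of the two spreads.
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(
        sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)
    )
    # A flat image has zero spread; treat it as uncorrelated.
    return num / den if den else 0.0


def classify(image, templates):
    """Return the best-matching label and the score for every template."""
    scores = {label: ncc(image, tmpl) for label, tmpl in templates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy templates (NOT Sindhi glyphs): two shapes that differ
# by a single pixel, to show how similar shapes yield close scores.
TEMPLATES = {
    "bar": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "hook": [[1, 1, 0], [0, 1, 0], [0, 1, 0]],
}

if __name__ == "__main__":
    label, scores = classify(TEMPLATES["bar"], TEMPLATES)
    print(label, scores)
```

In this toy setup the two templates still correlate strongly with each other even though they represent different classes, which illustrates the kind of ambiguity the abstract attributes to highly correlated numeral shapes: a small amount of handwriting variation can be enough to flip the best match.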
