Publication | Closed Access
Form reading based on form-type identification and form-data recognition
Citations: 28
References: 8
Year: 2005
Unknown Venue
Engineering, Biometrics, Keyword Dictionary, Psycholinguistics, Form Reading, Corpus Linguistics, Language Processing, Speech Recognition, Natural Language Processing, Image Analysis, Information Retrieval, Data Science, Language Documentation, Pattern Recognition, Computational Linguistics, Language Studies, Character Recognition, Form-data Recognition, Optical Character Recognition, Computer Science, Signal Processing, Speech Acquisition, Language Recognition, Speech Processing, Form-type Identification, Text Processing, Linguistics, Document Processing, Pattern Recognition Application
A form reading technology based on form-type identification and form-data recognition is proposed. This technology addresses the difficulty of reading varied items on a fairly large number of different form types. Form-type identification consists of two parts: (i) extraction of targets, such as important keywords in a form, by matching recognised characters against word strings in a keyword dictionary, and (ii) analysis of the positional or semantic relationships between the targets by constellation matching between these targets and the word location information in the keyword dictionary. Form-data recognition also consists of two parts: (i) extraction of a region of interest (ROI) containing the character string of an item, using layout knowledge of that particular form type, and (ii) character string recognition of the item, using linguistic constraints obtained from content knowledge of the form type. An experiment using 642 sample forms covering 107 different types in total confirmed that the form-type identification method can correctly identify 97% of the 642 form samples at a rejection rate of 3%. Another experiment confirmed that the form-data recognition method can correctly read 95% of the items on the form samples.
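The two-part form-type identification described in the abstract might be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the `Word` record, the `KEYWORD_DICTIONARY` contents, the `difflib`-based string similarity, the pairwise-offset scoring, and every threshold are assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from math import hypot


@dataclass(frozen=True)
class Word:
    text: str
    x: float  # normalised horizontal position on the page (assumed 0..1)
    y: float  # normalised vertical position on the page (assumed 0..1)


# Hypothetical keyword dictionary: form type -> keywords with expected locations.
KEYWORD_DICTIONARY = {
    "invoice": [Word("invoice", 0.5, 0.05), Word("amount", 0.7, 0.8),
                Word("due date", 0.7, 0.9)],
    "application": [Word("application", 0.5, 0.05), Word("name", 0.1, 0.2),
                    Word("address", 0.1, 0.3)],
}


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1] (assumed matching measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def extract_targets(ocr_words, keywords, min_sim=0.8):
    """Part (i): pair each dictionary keyword with its best-matching OCR word."""
    targets = []
    for kw in keywords:
        best = max(ocr_words, key=lambda w: similarity(w.text, kw.text), default=None)
        if best is not None and similarity(best.text, kw.text) >= min_sim:
            targets.append((kw, best))
    return targets


def constellation_score(targets, n_keywords, tol=0.1):
    """Part (ii): check that offsets between target pairs match the dictionary."""
    if len(targets) < 2:
        return 0.0
    agree = total = 0
    for i in range(len(targets)):
        for j in range(i + 1, len(targets)):
            (ka, wa), (kb, wb) = targets[i], targets[j]
            dx = (kb.x - ka.x) - (wb.x - wa.x)
            dy = (kb.y - ka.y) - (wb.y - wa.y)
            total += 1
            agree += hypot(dx, dy) <= tol
    # Weight positional agreement by the fraction of keywords that were found.
    return (len(targets) / n_keywords) * (agree / total)


def identify_form_type(ocr_words, reject_below=0.5):
    """Return the best-scoring form type, or None to reject the form."""
    scores = {
        name: constellation_score(extract_targets(ocr_words, kws), len(kws))
        for name, kws in KEYWORD_DICTIONARY.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] >= reject_below else None


# Usage: noisy OCR output ("lnvoice") on a slightly shifted page.
scanned = [Word("lnvoice", 0.52, 0.07), Word("Amount", 0.72, 0.82),
           Word("Due Date", 0.72, 0.92)]
result = identify_form_type(scanned)
```

Comparing offsets between pairs of targets, rather than absolute positions, keeps the score unchanged when the whole page is shifted during scanning. Returning `None` below the threshold plays the role of the rejection option that the abstract reports alongside the identification rate.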