Publication | Closed Access
Toward a practical document understanding of table-form documents: its framework and knowledge representation
Citations: 30
References: 6
Year: 2002
Unknown Venue
Engineering, Structural Pattern Recognition, Knowledge Extraction, Structured Data, Semantic Web, Four-layer Recognition Processes, Image Analysis, Information Retrieval, Data Science, Data Mining, Pattern Recognition, Document Engineering, Document Understanding, Management, Table-form Documents, Data Integration, Data Management, Knowledge Representation, Knowledge Representation Method, Computer Science, Statistical Pattern Recognition, Practical Document Understanding, Classification Tree, Knowledge Base, Knowledge Management, Structured Document, Document Processing, Data Modeling, Pattern Recognition Application
A framework of four-layer recognition processes is proposed for document understanding, together with a knowledge representation method suited to table-form documents. Although Y. Nakano et al. (1986) regarded the recognition of multiple kinds of table-form documents as an important practical problem, they could not report a successful approach because their knowledge was based only on physical coordinate data. The approach presented here solves this recognition problem by using both a classification tree based on physical characteristics and a structure description tree based on logical characteristics. Classifying table-form documents into appropriate document classes is not especially difficult, because such documents are designed around vertical and horizontal line segments. Classification is harder for other kinds of documents, whose geometric and spatial characteristics are not well specified. Application techniques for these other documents still need to be investigated from the viewpoint of knowledge representation.
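The pairing of a classification tree (physical characteristics) with a structure description tree (logical characteristics) can be illustrated with a minimal Python sketch. The abstract gives no implementation details, so everything here is an assumption made for illustration: the feature names (`horizontal_lines`, `vertical_lines`), the thresholds, the document class names, and the field labels are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class PhysicalFeatures:
    # Hypothetical physical characteristics: counts of ruled line segments.
    horizontal_lines: int
    vertical_lines: int


@dataclass
class ClassificationNode:
    # Internal node: holds a test on physical features and two branches.
    # Leaf node: holds only a document class name.
    test: Optional[Callable[[PhysicalFeatures], bool]] = None
    yes: Optional["ClassificationNode"] = None
    no: Optional["ClassificationNode"] = None
    document_class: Optional[str] = None


@dataclass
class StructureNode:
    # Node of a structure description tree: a logical region and its parts.
    label: str
    children: List["StructureNode"] = field(default_factory=list)


def classify(node: ClassificationNode, features: PhysicalFeatures) -> str:
    """Walk the classification tree until a leaf (document class) is reached."""
    while node.document_class is None:
        node = node.yes if node.test(features) else node.no
    return node.document_class


def leaf_labels(node: StructureNode) -> List[str]:
    """Collect the logical fields (leaves) of a structure tree, left to right."""
    if not node.children:
        return [node.label]
    labels: List[str] = []
    for child in node.children:
        labels.extend(leaf_labels(child))
    return labels


# Hypothetical classification tree with invented thresholds.
CLASSIFICATION_TREE = ClassificationNode(
    test=lambda f: f.horizontal_lines >= 5,
    yes=ClassificationNode(
        test=lambda f: f.vertical_lines >= 4,
        yes=ClassificationNode(document_class="order_form"),
        no=ClassificationNode(document_class="ledger"),
    ),
    no=ClassificationNode(document_class="application_form"),
)

# Hypothetical structure description trees, one per document class.
STRUCTURE_TREES: Dict[str, StructureNode] = {
    "order_form": StructureNode("order_form", [
        StructureNode("header", [StructureNode("customer_name"),
                                 StructureNode("date")]),
        StructureNode("body", [StructureNode("item"),
                               StructureNode("quantity")]),
    ]),
    "ledger": StructureNode("ledger", [
        StructureNode("account"), StructureNode("amount"),
    ]),
    "application_form": StructureNode("application_form", [
        StructureNode("applicant_name"), StructureNode("signature"),
    ]),
}


def understand(features: PhysicalFeatures) -> Tuple[str, List[str]]:
    """Classify by physical features, then look up the logical fields."""
    document_class = classify(CLASSIFICATION_TREE, features)
    return document_class, leaf_labels(STRUCTURE_TREES[document_class])
```

In this sketch, `understand(PhysicalFeatures(horizontal_lines=6, vertical_lines=5))` first selects a document class from line-segment counts and then returns the logical fields expected for that class. Keeping the two trees separate mirrors the abstract's distinction between physical and logical knowledge, though the paper's actual representation may differ.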