Publication | Closed Access
Trainable table location in document images
Citations: 97
References: 7
Year: 2003
Unknown Venue
Engineering, Trainable Table Location, Document Image Analysis, Image Database, Image Search, Commercial OCRs, Localization, Text Mining, Image Analysis, Information Retrieval, Data Science, Pattern Recognition, Text Recognition, Machine Vision, Optical Character Recognition, Computer Science, Table Location, Computer Vision, MXY Tree, Document Processing, Content-based Image Retrieval
We describe an approach for table location in document images. Documents are described by a hierarchical representation based on the MXY tree. The presence of a table is hypothesized by searching for parallel lines in the MXY tree of the page. This hypothesis is then verified by locating perpendicular lines or white spaces in the region between the parallel lines. Finally, located tables can be merged on the basis of proximity and similarity criteria. An optimization method, which relies on the definition of an appropriate table location index, allows us to identify the optimal values of the thresholds involved in the algorithm. In this way, the algorithm can be adapted to recognize tables with different features by maximizing performance on an appropriate training set. The algorithm has been evaluated on two data sets containing more than 1500 pages, and its results have been compared with the tables identified by two commercial OCRs.
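The hypothesize, verify, and merge steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it works on flat lists of already-detected ruling lines rather than an MXY tree, it verifies only with perpendicular lines (not white spaces), and every name and threshold here (`HLine`, `VLine`, `max_offset`, `max_gap`, `min_vlines`) is a hypothetical choice made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class HLine:  # hypothetical horizontal ruling line
    y: int
    x0: int
    x1: int


@dataclass(frozen=True)
class VLine:  # hypothetical vertical ruling line
    x: int
    y0: int
    y1: int


Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1)


def hypothesize(hlines: List[HLine], max_offset: int = 10) -> List[Box]:
    """Propose a region between each pair of vertically consecutive
    horizontal lines whose left and right ends roughly coincide."""
    ordered = sorted(hlines, key=lambda line: line.y)
    boxes = []
    for top, bottom in zip(ordered, ordered[1:]):
        aligned = (abs(top.x0 - bottom.x0) <= max_offset
                   and abs(top.x1 - bottom.x1) <= max_offset)
        if aligned:
            boxes.append((min(top.x0, bottom.x0), top.y,
                          max(top.x1, bottom.x1), bottom.y))
    return boxes


def verify(box: Box, vlines: List[VLine], min_vlines: int = 1) -> bool:
    """Accept a region if enough vertical lines cross its full height."""
    x0, y0, x1, y1 = box
    crossing = [v for v in vlines
                if x0 <= v.x <= x1 and v.y0 <= y0 and v.y1 >= y1]
    return len(crossing) >= min_vlines


def merge(boxes: List[Box], max_gap: int = 5, max_offset: int = 10) -> List[Box]:
    """Merge vertically close regions with similar horizontal extent."""
    merged: List[Box] = []
    for box in sorted(boxes, key=lambda b: b[1]):
        if merged:
            px0, py0, px1, py1 = merged[-1]
            x0, y0, x1, y1 = box
            close = y0 - py1 <= max_gap
            similar = (abs(x0 - px0) <= max_offset
                       and abs(x1 - px1) <= max_offset)
            if close and similar:
                merged[-1] = (min(px0, x0), py0, max(px1, x1), max(py1, y1))
                continue
        merged.append(box)
    return merged


def locate_tables(hlines: List[HLine], vlines: List[VLine]) -> List[Box]:
    """Hypothesize regions, keep the verified ones, then merge neighbours."""
    candidates = [b for b in hypothesize(hlines) if verify(b, vlines)]
    return merge(candidates)
```

In this sketch the thresholds are fixed defaults. The training described in the abstract would instead search for the threshold values that maximize a table location index on a training set, which is not modelled here.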