Concepedia

TLDR

Table-form document structure analysis is a key challenge in document processing. Most prior work relies on line-oriented methods, but real documents are copied repeatedly and overlaid with printed data, so characters touch cells and lines break. The study introduces Box Driven Reasoning (BDR), which operates directly on regions rather than lines. Experiments show that BDR reliably recognizes cells and strings in document images with touching characters and broken lines.

Abstract

Table-form document structure analysis is an important problem in the document processing domain. This paper presents a method called Box Driven Reasoning (BDR) that robustly analyzes the structure of table-form documents containing touching characters and broken lines. Real documents are copied repeatedly and overlaid with printed data, which results in characters that touch cells and lines that are broken. Most previous methods employ a line-oriented approach; BDR, in contrast, deals with regions directly. Experimental tests show that BDR reliably recognizes cells and strings in document images with touching characters and broken lines.
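
The abstract gives no algorithmic detail, so the following is only a minimal sketch of what "dealing with regions directly" could look like, not the paper's BDR procedure. It assumes a toy binary image given as strings ('#' for ink, '.' for background) and treats every background region that does not reach the image border as a candidate cell, reporting its bounding box. The function name `find_cell_boxes`, the `IMAGE` grid, and the flood-fill approach are all illustrative assumptions. Note that the right-hand cell contains an ink pixel touching its border, yet the region is still found without tracing any line. This naive version does not cope with broken lines, which the paper reports BDR does handle.

```python
from collections import deque

# Toy image (assumption): '#' is ink, '.' is background.
# The left cell holds an isolated mark; the right cell holds
# a mark that touches the cell's right border.
IMAGE = [
    "#########",
    "#...#...#",
    "#.#.#..##",
    "#...#...#",
    "#########",
]


def find_cell_boxes(grid):
    """Return (top, left, bottom, right) boxes of enclosed background regions."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != "." or seen[r][c]:
                continue
            # Flood-fill one background region with 4-connectivity.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            touches_border = False
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                if y in (0, rows - 1) or x in (0, cols - 1):
                    touches_border = True
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == "." and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Regions reaching the image border are page margin, not cells.
            if not touches_border:
                boxes.append((top, left, bottom, right))
    return boxes


print(find_cell_boxes(IMAGE))
```

On this grid the sketch reports one box per cell, in row-major order of discovery. A gap in the shared vertical line would let the flood fill leak between the two cells and merge them, which is exactly the kind of degradation a robust method has to reason about.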
