Publication | Closed Access
A rule-based system for document image segmentation
Citations: 118
References: 9
Year: 2002
Unknown Venue
Document Processing, Engineering, Document Image Analysis, Image Analysis, Information Retrieval, Data Science, Mathematical Morphology, Pattern Recognition, Text Recognition, Text Segmentation, Word Segmentation (Natural Language Processing), Text Regions, Document Image Segmentation, Machine Vision, Optical Character Recognition, Word Segmentation (Phonological Awareness), Computer Science, Computer Vision, Rule-based System, Arts, Document Image, Image Segmentation
A rule-based system for automatically segmenting a document image into regions of text and nontext is presented. The initial stages of the system perform image enhancement functions such as adaptive thresholding, morphological processing, and skew detection and correction. The image segmentation process consists of smearing the original image via the run length smoothing algorithm, calculating the locations and statistics of the connected components, and filtering (segmenting) the image based on these statistics. The text regions can be converted (via an optical character reader) to a computer-searchable form, and the nontext regions can be extracted and preserved. The rule-based structure allows easy fine tuning of the algorithmic steps to produce robust rules, to incorporate additional tools (as they become available), and to handle special segmentation needs.
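The three segmentation steps named in the abstract (run length smoothing, connected-component statistics, rule-based filtering) can be illustrated with a minimal pure-Python sketch. Everything here is an assumption for illustration: the function names, the horizontal-only smearing, the 4-connectivity, and the single height rule with its `max_text_height` threshold are hypothetical and are not taken from the paper, whose actual rules are not given in the abstract.

```python
from collections import deque


def rlsa_row(row, max_gap):
    """Smear one binary row: fill runs of 0s no longer than max_gap
    that are bounded by ink (1) on both sides."""
    out = list(row)
    last_ink = None  # index of the most recent ink pixel seen so far
    for i, px in enumerate(row):
        if px:
            if last_ink is not None and 0 < i - last_ink - 1 <= max_gap:
                for j in range(last_ink + 1, i):
                    out[j] = 1
            last_ink = i
    return out


def rlsa_horizontal(image, max_gap):
    """Apply the row smearing to every row of a binary image."""
    return [rlsa_row(row, max_gap) for row in image]


def connected_components(image):
    """Return bounding box and pixel count of each 4-connected component,
    in raster-scan order of the first pixel encountered."""
    h = len(image)
    w = len(image[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    comps = []
    for y in range(h):
        for x in range(w):
            if not image[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            top = bottom = y
            left = right = x
            area = 0
            while queue:
                cy, cx = queue.popleft()
                area += 1
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and image[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            comps.append({"top": top, "bottom": bottom, "left": left,
                          "right": right, "area": area})
    return comps


def classify(comp, max_text_height=3):
    """Hypothetical rule: short components are text, tall ones are nontext."""
    height = comp["bottom"] - comp["top"] + 1
    return "text" if height <= max_text_height else "nontext"


# Toy page: three thin "characters" on the left, one tall block on the right.
image = [
    [1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1],
    [1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1],
]
smeared = rlsa_horizontal(image, max_gap=2)
labels = [classify(c) for c in connected_components(smeared)]
```

On this toy input the smearing merges the three characters into a single component while the three-pixel gap keeps the tall block separate, so the height rule labels the first component as text and the second as nontext. In a rule-based design of the kind the abstract describes, `classify` is the natural place to add or tune further rules without touching the earlier stages.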