Publication | Open Access
Learning to recognize tables in free text
Citations: 76
References: 2
Year: 1999
Unknown Venue
Engineering, Machine Learning, Knowledge Extraction, Deterministic Table Recognition, Text Mining, Natural Language Processing, Information Retrieval, Data Science, Data Mining, Pattern Recognition, Computational Linguistics, Document Classification, Many Real-world Texts, Free Text, Knowledge Discovery, Computer Science, Information Extraction, Text Processing, Document Processing
Many real-world texts contain tables. To process these texts correctly and extract the information contained within the tables, it is important to identify the presence and structure of tables. In this paper, we present a new approach that learns to recognize tables in free text, including the boundary, rows and columns of tables. When tested on Wall Street Journal news documents, our learning approach outperforms a deterministic table recognition algorithm that identifies tables based on a fixed set of conditions. Our learning approach is also more flexible and easily adaptable to texts in different domains with different table characteristics.
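To make the contrast in the abstract concrete, the sketch below shows what a rule-based recognizer built on a fixed set of conditions might look like. It is not the deterministic baseline evaluated in the paper, nor the learning approach. The names `count_column_gaps` and `find_table_blocks`, and the thresholds `min_gaps` and `min_rows`, are hypothetical choices made only for illustration.

```python
import re

# Assumed fixed condition (illustrative only): a tab or a run of two or more
# spaces between two non-space characters is treated as a column gap.
COLUMN_GAP = re.compile(r"\S(?:\t| {2,})(?=\S)")


def count_column_gaps(line: str) -> int:
    """Count the column gaps found in a single line of text."""
    return len(COLUMN_GAP.findall(line.strip()))


def find_table_blocks(lines, min_gaps=2, min_rows=2):
    """Return (start, end) index pairs of consecutive table-like lines.

    A line is table-like when it has at least `min_gaps` column gaps.
    A block is reported only if it spans at least `min_rows` lines.
    Both thresholds are hand-picked assumptions, not values from the paper.
    """
    blocks = []
    start = None
    for i, line in enumerate(lines):
        if count_column_gaps(line) >= min_gaps:
            if start is None:
                start = i  # a candidate table block begins here
        else:
            if start is not None and i - start >= min_rows:
                blocks.append((start, i - 1))
            start = None
    # Close a block that runs to the end of the input.
    if start is not None and len(lines) - start >= min_rows:
        blocks.append((start, len(lines) - 1))
    return blocks
```

Hard-coded thresholds like these have to be retuned by hand whenever the table layout changes, which is the kind of rigidity the abstract contrasts with an approach that learns table characteristics from data.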