Publication | Open Access
Mining tables from large scale HTML texts
Citations: 164
References: 6
Year: 2000
Unknown Venue
Engineering, Knowledge Extraction, Table Recognition, Table Extraction, Semantic Web, Corpus Linguistics, Text Mining, Natural Language Processing, Information Retrieval, Data Science, Data Mining, Pattern Recognition, Table Cells, Knowledge Discovery, Computer Science, Information Extraction, Web Mining, Structure Mining, Data Extraction
Tables are a very common presentation scheme, but few papers address table extraction in text data mining. This paper focuses on mining tables from large-scale HTML texts. Table filtering, recognition, interpretation, and presentation are discussed. Heuristic rules and cell similarities are employed to identify tables. The F-measure of table recognition is 86.50%. We also propose an algorithm to capture attribute-value relationships among table cells. Finally, more structured data is extracted and presented.
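The abstract names the stages but gives no detail, so the following is only a minimal sketch of how filtering, similarity-based recognition, and attribute-value interpretation might fit together. It is not the authors' algorithm. Every name here (`extract_tables`, `cell_type`, `column_similarity`, `looks_like_data_table`, `attribute_value_pairs`), the size limits, the 0.8 threshold, and the assumption that the first row holds the attribute names are hypothetical choices made for illustration.

```python
from html.parser import HTMLParser


class _TableCollector(HTMLParser):
    """Collect the cell text of each top-level <table> as a list of rows."""

    def __init__(self):
        super().__init__()
        self.tables = []   # finished tables: list of rows, each a list of str
        self._rows = None  # rows of the table currently being parsed
        self._row = None   # cells of the current row
        self._cell = None  # text fragments of the current cell
        self._depth = 0    # nesting depth of <table> tags

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._depth += 1
            if self._depth == 1:
                self._rows = []
        elif self._depth == 1:
            if tag == "tr":
                self._row = []
            elif tag in ("td", "th") and self._row is not None:
                self._cell = []

    def handle_endtag(self, tag):
        if tag == "table":
            if self._depth == 1 and self._rows is not None:
                self.tables.append(self._rows)
                self._rows = self._row = self._cell = None
            self._depth = max(0, self._depth - 1)
        elif self._depth == 1:
            if tag in ("td", "th") and self._cell is not None and self._row is not None:
                # Join the fragments and normalize internal whitespace.
                self._row.append(" ".join("".join(self._cell).split()))
                self._cell = None
            elif tag == "tr" and self._row is not None:
                self._rows.append(self._row)
                self._row = self._cell = None

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def extract_tables(html):
    """Return every top-level table in `html` as a list of rows of cell strings."""
    collector = _TableCollector()
    collector.feed(html)
    return collector.tables


def cell_type(text):
    """Coarse cell type used for the similarity check: 'num', 'text', or 'empty'."""
    stripped = text.replace(",", "").replace("%", "").strip()
    if not stripped:
        return "empty"
    try:
        float(stripped)
        return "num"
    except ValueError:
        return "text"


def column_similarity(rows):
    """Fraction of body cells whose type matches the majority type of their column."""
    body, width = rows[1:], len(rows[0])
    matches = total = 0
    for col in range(width):
        types = [cell_type(r[col]) for r in body if col < len(r)]
        if types:
            majority = max(set(types), key=types.count)
            matches += types.count(majority)
            total += len(types)
    return matches / total if total else 0.0


def looks_like_data_table(rows, min_rows=2, min_cols=2, threshold=0.8):
    """Heuristic filter (assumed limits): big enough, and columns are type-consistent."""
    if len(rows) < min_rows or len(rows[0]) < min_cols:
        return False
    return column_similarity(rows) >= threshold


def attribute_value_pairs(rows):
    """Pair each body cell with the header cell above it (assumes a header row)."""
    header, body = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in body]
```

Under these assumptions, a single-cell layout table is rejected by the size rule, while a table with an `Item`/`Price` header and consistently typed columns passes the filter and yields one attribute-value mapping per row. Real pages would need more handling (row and column spans, headers in the first column, nested tables), which this sketch deliberately leaves out.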