Publication | Closed Access
A Comparison of Web Data Extraction Techniques
Citations: 13
References: 16
Year: 2019
Structured Text Data, Web Mining, HTML Files, Information Retrieval, Data Science, Data Mining, Content Analysis, Engineering, Keyword Extraction, Data Quality, Data Integration, Computer Science, Semantic Web, Data Extraction, Information Extraction, Content Processing, Text Mining
Extracting structured text data from published web pages has drawn attention over the last decade. Web data extraction poses many challenges because of the variety of web data and the unstructured form of HTML files. The aim of this survey is to provide a comprehensive overview of current web data extraction techniques in terms of extracted data quality, where redundant and noisy data should be eliminated. The merits and demerits of each web information extraction technique are stated, and finally a classification framework for the discussed techniques is provided.