Publication | Closed Access
A Survey on Text Mining and Sentiment Analysis for Unstructured Web Data
Citations: 19 | References: 4 | Year: 2015
Engineering, Business Intelligence, Computational Analysis, Unstructured Web Data, Corpus Linguistics, Sentiment Analysis, Text Mining, Content Mining, Natural Language Processing, Information Retrieval, Data Science, Data Mining, Content Analysis, Social Medium Mining, Web Data, Unstructured Data, Knowledge Discovery, Web Mining (Data Mining), Web Text Mining, Information Extraction, Social Media Platforms, Social Media Mining, Web Mining (Geotechnical Engineering), Web Mining, Web Intelligence, Keyword Extraction, Arts
Unstructured data is information that has no pre-defined data model. It is typically textual, but may also contain numerical data and factual details. As a result, the data is obscure, irregular and ambiguous, which makes it difficult to analyse using conventional computing means. Much of the data on the web, in the form of blogs, news and social media platforms, is unstructured, yet it is a potentially vast source of information if processed efficiently. This paper discusses the basics of harnessing unstructured data from the web and the techniques used to process it. The concepts of web crawling, text mining and natural language processing (NLP) are discussed in brief, to give an outline of how web data is processed and analysed. Sentiment analysis, a major aspect of present-day NLP, is also described, along with the issue of mining data from Twitter, which has emerged as the most important data source for NLP in the recent past. The paper concludes with a brief outline of the uses of web data mining and analysis, and the potential for future growth in the field.
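The abstract names sentiment analysis of unstructured text without showing what it involves. The sketch below is one simple way to do it, a lexicon-based scorer, and is not necessarily a method the paper describes. The word lists and the function names `tokenize`, `sentiment_score` and `classify` are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy lexicons (assumption); real systems use far larger curated lists.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "poor", "sad"}
NEGATORS = {"not", "no", "never"}


def tokenize(text):
    """Lowercase raw text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text):
    """Sum word polarities; a negator directly before a word flips its sign."""
    tokens = tokenize(text)
    score = 0
    for i, tok in enumerate(tokens):
        # +1 for a positive word, -1 for a negative word, 0 otherwise.
        polarity = int(tok in POSITIVE) - int(tok in NEGATIVE)
        if polarity and i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    return score


def classify(text):
    """Map the numeric score to a coarse sentiment label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for post in ["I love this great phone", "Not good at all", "It arrived on Monday"]:
        print(post, "->", classify(post))
```

A scorer like this ignores sarcasm, slang and context, which are common in short social media posts, so it serves only as a baseline illustration.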