Graph-Based Term Weighting for Text Categorization
Citations: 54 | References: 30 | Year: 2015
Keywords: Engineering, Graph-based Term Weighting, Corpus Linguistics, Sentiment Analysis, Text Mining, Natural Language Processing, Information Retrieval, Data Science, Data Mining, Document Classification, Language Studies, Content Analysis, Automatic Classification, Text Categorization, Knowledge Discovery, Terminology Extraction, Weighting Scheme, Vector Space Model, Keyword Extraction, Linguistics
Text categorization is an important task with many applications, ranging from sentiment analysis to automated news classification. In this paper, we introduce a novel graph-based approach for text categorization. In contrast to the traditional Bag-of-Words model for document representation, we consider a model in which each document is represented by a graph that encodes relationships between its terms. The importance of a term to a document is measured using graph-theoretic node centrality criteria. The proposed weighting scheme captures the relationships between terms that co-occur in a document, producing feature vectors that can improve categorization. We perform experiments on well-known document collections, applying popular classification algorithms. Our preliminary results indicate that, with appropriate parameter settings, the proposed graph-based weighting mechanism can outperform existing frequency-based term weighting criteria.
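The idea of weighting terms by node centrality in a document graph can be illustrated with a minimal Python sketch. The abstract does not state how the graph is built or which centrality measure is used, so everything specific here is an assumption made for illustration: an undirected co-occurrence graph over a sliding window, a hypothetical `window` parameter, normalized degree centrality as the weight, and the function names themselves.

```python
from collections import defaultdict


def build_cooccurrence_graph(tokens, window=3):
    """Build an undirected term graph for one document.

    Assumption: two terms are linked if they appear within `window`
    consecutive tokens of each other (the window includes the current term).
    """
    neighbors = defaultdict(set)
    for i, term in enumerate(tokens):
        neighbors[term]  # make sure every term is a node, even if isolated
        for other in tokens[i + 1 : i + window]:
            if other != term:  # skip self-loops
                neighbors[term].add(other)
                neighbors[other].add(term)
    return neighbors


def degree_centrality_weights(tokens, window=3):
    """Weight each term by its normalized degree in the document graph.

    Assumption: degree centrality stands in for whichever centrality
    criterion is actually used; it is simply the easiest to compute.
    """
    graph = build_cooccurrence_graph(tokens, window)
    n = len(graph)
    if n <= 1:
        # Normalized degree is undefined for fewer than two nodes.
        return {term: 0.0 for term in graph}
    return {term: len(adj) / (n - 1) for term, adj in graph.items()}


if __name__ == "__main__":
    doc = "graph based term weighting for text categorization".split()
    for term, weight in sorted(degree_centrality_weights(doc).items()):
        print(f"{term}: {weight:.2f}")
```

Unlike a raw term frequency, a weight of this kind depends on how many distinct terms a word co-occurs with, so a term repeated in the same context gains nothing from the repetition. The resulting per-document dictionaries could replace frequency values in a standard vector space representation before a classifier is trained.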