Publication | Open Access
A framework of feature selection methods for text categorization
Citations: 112
References: 16
Year: 2009
Unknown Venue
Engineering, Frequency Measurement, Feature Selection, Multimodal Sentiment Analysis, Sentiment Analysis, Corpus Linguistics, Text Mining, Natural Language Processing, Classification Method, Information Retrieval, Data Science, Data Mining, Document Classification, Content Analysis, Statistics, Automatic Classification, Text Categorization, Knowledge Discovery, Intelligent Classification, Feature Selection Methods
In text categorization, feature selection (FS) is a strategy for making text classifiers more efficient and accurate. However, when facing a new task, it remains difficult to quickly choose a suitable method from the many FS methods proposed in previous studies. In this paper, we propose a theoretical framework of FS methods based on two basic measurements: frequency measurement and ratio measurement. Six popular FS methods are then discussed in detail under this framework. Moreover, guided by our theoretical analysis, we propose a novel method called weighed frequency and odds (WFO), which combines the two measurements with trained weights. Experimental results on data sets from both topic-based and sentiment classification tasks show that the new method is robust across different tasks and across different numbers of selected features.
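The abstract does not give the WFO formula, so the following is only a minimal sketch of how a frequency measurement and a ratio measurement might be combined with a weight. It assumes the frequency term is a smoothed estimate of P(t|c), the ratio term is the log of P(t|c) / P(t|not c), and the two are multiplied with exponents `lam` and `1 - lam`. The function name `wfo_scores`, the parameters `lam` and `smooth`, and the add-one smoothing are all hypothetical choices made for illustration, and `lam` is fixed by hand here rather than trained.

```python
import math
from collections import Counter


def wfo_scores(docs, labels, target, lam=0.5, smooth=1.0):
    """Score every term for class `target` (illustrative sketch only).

    docs:   list of token lists
    labels: class labels, parallel to docs
    lam:    hypothetical weight in [0, 1]; 1 uses only the frequency
            term, 0 uses only the ratio term
    smooth: hypothetical additive smoothing constant
    """
    pos = [set(d) for d, y in zip(docs, labels) if y == target]
    neg = [set(d) for d, y in zip(docs, labels) if y != target]
    # Document frequency of each term inside and outside the class.
    pos_df = Counter(t for d in pos for t in d)
    neg_df = Counter(t for d in neg for t in d)

    scores = {}
    for term in set(pos_df) | set(neg_df):
        # Frequency measurement: smoothed estimate of P(t | c).
        p_pos = (pos_df[term] + smooth) / (len(pos) + 2 * smooth)
        # Smoothed estimate of P(t | not c), needed for the ratio.
        p_neg = (neg_df[term] + smooth) / (len(neg) + 2 * smooth)
        ratio = p_pos / p_neg
        if ratio > 1:
            # Assumed combination: frequency^lam * (log ratio)^(1 - lam).
            scores[term] = (p_pos ** lam) * (math.log(ratio) ** (1 - lam))
        else:
            # Terms that do not favor the class get a zero score here.
            scores[term] = 0.0
    return scores
```

Under these assumptions, ranking terms by score and keeping the top k for each class would yield the selected feature set. Sweeping `lam` on held-out data is one simple stand-in for the trained weights the abstract mentions.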