Publication | Closed Access
Phrase-based document categorization revisited
Citations: 25
References: 10
Year: 2009
Venue: Unknown
Engineering, Intelligent Information Retrieval, Semantics, Corpus Linguistics, Text Mining, Natural Language Processing, Applied Linguistics, Information Retrieval, Aka Classification, Computational Linguistics, Document Classification, Language Studies, Automatic Categorization, Phrase-based Document Categorization, Automatic Classification, Knowledge Discovery, Terminology Extraction, Vector Space Model, Keyword Extraction, Linguistics
This paper takes a fresh look at an old idea in Information Retrieval: the use of linguistically extracted phrases as terms in the automatic categorization (also known as classification) of documents. Until now, little or no evidence has been found that document categorization benefits from the application of linguistic techniques. Classification algorithms using even the most cleverly designed linguistic representations typically do no better than those using the simple bag-of-words representation. Shallow linguistic techniques are used routinely, but their positive effect on accuracy is small at best.