Publication | Closed Access
Olex: Effective Rule Learning for Text Categorization
Citations: 22
References: 25
Year: 2008
Keywords: Engineering, Rule-based Text Classifiers, Text Mining, Natural Language Processing, Information Retrieval, Data Science, Data Mining, Effective Rule Learning, Computational Linguistics, Document Classification, Language Studies, Automatic Classification, Naive Bayes, Knowledge Discovery, Intelligent Classification, Computer Science, Grammar Induction, Automated Reasoning, Rule Induction, Linguistics, Automatic Induction
This paper describes Olex, a novel method for the automatic induction of rule-based text classifiers. Olex supports a hypothesis language of the form "if T_1 or ... or T_n occurs in document d, and none of T_{n+1}, ..., T_{n+m} occurs in d, then classify d under category c," where each T_i is a conjunction of terms. The proposed method is simple and elegant. Despite this, the results of a systematic experimentation performed on the REUTERS-21578, the OHSUMED, and the ODP data collections show that Olex provides classifiers that are accurate, compact, and comprehensible. A comparative analysis conducted against some of the most well-known learning algorithms (namely, Naive Bayes, Ripper, C4.5, SVM, and Linear Logistic Regression) demonstrates that it is more than competitive in terms of both predictive accuracy and efficiency.
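To make the shape of the hypothesis language concrete, the following is a minimal sketch of how a single rule of that form could be evaluated against a document. It illustrates only rule application, not the Olex induction procedure, which the abstract does not detail. The names `Rule`, `positive`, `negative`, and `matches`, the bag-of-terms document representation, and the sample terms and category are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Tuple

# A conjunction of terms "occurs" in a document when every one of its
# terms is present (assumption: documents are treated as sets of terms).
Conjunction = FrozenSet[str]


@dataclass(frozen=True)
class Rule:
    """Hypothetical container for one rule of the form in the abstract."""
    category: str
    positive: Tuple[Conjunction, ...]  # T_1 ... T_n: at least one must occur
    negative: Tuple[Conjunction, ...]  # T_{n+1} ... T_{n+m}: none may occur

    def matches(self, document_terms: Iterable[str]) -> bool:
        terms = set(document_terms)
        # frozenset <= set is the subset test: all terms of the conjunction occur.
        has_positive = any(conj <= terms for conj in self.positive)
        has_negative = any(conj <= terms for conj in self.negative)
        return has_positive and not has_negative


# Illustrative rule: classify under "grain" if "wheat" occurs, or both
# "corn" and "export" occur, unless both "stock" and "market" occur.
rule = Rule(
    category="grain",
    positive=(frozenset({"wheat"}), frozenset({"corn", "export"})),
    negative=(frozenset({"stock", "market"}),),
)

doc_a = "wheat prices rise".split()
doc_b = "corn export stock market".split()
doc_c = "corn harvest".split()

print(rule.matches(doc_a), rule.matches(doc_b), rule.matches(doc_c))
```

Here `doc_a` satisfies a positive conjunction and no negative one, `doc_b` satisfies a positive conjunction but is vetoed by the negative one, and `doc_c` satisfies no positive conjunction because "export" is missing. A rule of this kind stays readable because it is just a short list of term sets, which is consistent with the abstract's claim that the classifiers are compact and comprehensible.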