Publication | Closed Access

A Comparative Study on Chinese Text Categorization Methods.

Citations: 57

References: 12

Year: 2000

Abstract

This paper reports our comparative evaluation of three machine learning methods on Chinese text categorization. Whereas a wide range of methods have been applied to English text categorization, relatively few studies have been done on Chinese text categorization. Based on a People's Daily news corpus, a series of controlled experiments evaluate three machine learning methods, namely k Nearest Neighbor (kNN) algorithm, Support Vector Machines (SVM), and Adaptive Resonance Associative Map (ARAM), in terms of their capabilities in mining categorization knowledge from high dimensional, sparse, and relatively noisy document feature vectors. Experiments reveal that all three methods produce satisfactory performance on the test corpus while ARAM exhibits a marginally better generalization capability, especially from relatively small and noisy training sets.
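As a rough illustration of what kNN categorization over sparse document feature vectors can look like, the sketch below stores each document as a dictionary of term weights and votes among the k most similar training documents. It is not the paper's implementation: the cosine similarity measure, the similarity-weighted vote, the function names `cosine` and `knn_predict`, and the toy terms and labels are all assumptions made for this example.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as {term: weight} dicts."""
    # Iterate over the shorter vector so the dot product touches fewer terms.
    if len(a) > len(b):
        a, b = b, a
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(train, query, k=3):
    """Return the label with the largest summed similarity among the k nearest neighbours.

    train is a list of (sparse_vector, label) pairs (hypothetical toy format).
    """
    scored = sorted(
        ((cosine(vec, query), label) for vec, label in train),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter()
    for sim, label in scored[:k]:
        votes[label] += sim  # similarity-weighted vote (an assumption of this sketch)
    return votes.most_common(1)[0][0]


# Toy training set: the terms and labels are invented for illustration only.
train = [
    ({"market": 2, "bank": 1}, "economy"),
    ({"bank": 1, "trade": 2}, "economy"),
    ({"match": 2, "team": 1}, "sports"),
    ({"team": 2, "goal": 1}, "sports"),
]

print(knn_predict(train, {"bank": 1, "market": 1}, k=3))  # economy
```

Storing only the nonzero terms keeps each comparison proportional to the number of terms a document actually contains, which matters when the vocabulary is large and most entries are zero.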
