Publication | Closed Access
Application of Genetic Algorithm in Document Clustering
Citations: 17
References: 9
Year: 2009
Venue: Unknown
Evolutionary Data Mining, Cluster Computing, Document Clustering, Engineering, Information Retrieval, Data Science, Data Mining, Sparse Matrix, Knowledge Discovery, Document Classification, Genetic Algorithm, Fuzzy Clustering, Text Mining
After surveying existing methods for document clustering, we propose a new dynamic method based on a genetic algorithm (GA). K-means is a greedy algorithm that is sensitive to the choice of cluster centers and easily converges to a local optimum. A genetic algorithm is a globally convergent search method that can find the best cluster centers more easily. In traditional document clustering methods, the document similarity matrix is sparse. In this paper, we propose new formulas that improve on the traditional method, and we then make several improvements to the genetic algorithm. All individuals are encoded as floating-point numbers, and the sum of the mean square deviation of intra-class distances is adopted as the objective function. The steps of the algorithm are given in detail. The experimental results show that the accuracy of the GA can reach over 98 percent and that it generates better clustering results than K-means.
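The abstract names two concrete design choices: floating-point encoding of individuals and an intra-class squared-deviation objective. The following is a minimal sketch of how such a GA could look, not the paper's actual algorithm. Everything beyond those two choices is an assumption made for illustration: the function names (`sq_dist`, `objective`, `ga_cluster`), the parameter defaults, binary tournament selection, arithmetic crossover, Gaussian mutation, elitism, and the reading of the objective as the total squared distance from each point to its nearest center. Documents are assumed to be already converted to numeric vectors.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def objective(chrom, points, k):
    """Total squared distance from each point to its nearest center.

    Assumed interpretation of the intra-class deviation objective.
    The chromosome is a flat list of k * dim floats (k cluster centers).
    """
    dim = len(points[0])
    centers = [chrom[i * dim:(i + 1) * dim] for i in range(k)]
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def ga_cluster(points, k, pop_size=30, generations=50,
               mutation_rate=0.1, sigma=0.5, seed=0):
    """Hypothetical GA; operators and defaults are illustrative assumptions."""
    rng = random.Random(seed)

    def cost(chrom):
        return objective(chrom, points, k)

    # Each individual starts as k distinct data points, flattened to floats.
    population = []
    for _ in range(pop_size):
        chrom = []
        for p in rng.sample(points, k):
            chrom.extend(p)
        population.append(chrom)

    best = min(population, key=cost)
    for _ in range(generations):
        children = [best[:]]  # elitism: the best individual always survives
        while len(children) < pop_size:
            # Binary tournament selection of two parents.
            p1 = min(rng.sample(population, 2), key=cost)
            p2 = min(rng.sample(population, 2), key=cost)
            # Arithmetic crossover, natural for floating-point genes.
            w = rng.random()
            child = [w * a + (1 - w) * b for a, b in zip(p1, p2)]
            # Gaussian mutation applied gene by gene.
            child = [g + rng.gauss(0, sigma) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
        best = min(population, key=cost)
    return best, cost(best)


if __name__ == "__main__":
    # Two well-separated toy groups standing in for document vectors.
    data = [(i * 0.1, i * 0.1) for i in range(10)]
    data += [(10 + i * 0.1, 10 + i * 0.1) for i in range(10)]
    centers, total = ga_cluster(data, k=2)
    print(centers, total)
```

Because the best individual is copied unchanged into each new generation, the best objective value never gets worse from one generation to the next. That is the property that lets a population-based search avoid the dependence on a single starting configuration that the abstract attributes to K-means.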