Publication | Open Access
Projections for efficient document clustering
214
Citations
27
References
1997
Year
Unknown Venue
Clustering is increasing in importance, but linear- and even constant-time clustering algorithms are often too slow for real-time applications. A simple way to speed up clustering is to speed up the distance calculations at the heart of clustering routines. We study two techniques for reducing the cost of distance calculations, LSI and truncation, and determine both how much these techniques speed up clustering and how much they affect the quality of the resulting clusters. We find that the speed increase is significant while, surprisingly, the quality of clustering is not adversely affected. We conclude that truncation yields clusters as good as those produced by full-profile clustering while offering a significant speed advantage.
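The abstract does not spell out how truncation works, so the following is only a minimal sketch under an assumption: that truncation means keeping the highest-weighted terms of a sparse term-weight profile before computing similarities. The function names `truncate` and `cosine`, the cutoff `k`, and the sample weights are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import math


def truncate(profile, k):
    """Keep the k highest-weighted terms of a sparse term-weight profile.

    Assumption: this is one plausible reading of "truncation"; ties are
    broken alphabetically so the result is deterministic.
    """
    top = sorted(profile.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
    return dict(top)


def cosine(a, b):
    """Cosine similarity of two sparse profiles stored as dicts."""
    # Iterate over the shorter profile: the dot product costs
    # O(min(len(a), len(b))), which is why shorter profiles are cheaper.
    if len(a) > len(b):
        a, b = b, a
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Illustrative weights only, not data from the paper.
doc = {"cluster": 3.0, "distance": 2.0, "speed": 1.0, "the": 0.1}
centroid = {"cluster": 2.5, "distance": 1.5, "quality": 1.0, "of": 0.2}

full = cosine(doc, centroid)                            # full-profile similarity
short = cosine(truncate(doc, 2), truncate(centroid, 2))  # truncated similarity
```

Comparing `full` and `short` over a real collection is one way to check how much a given cutoff changes the similarities that drive cluster assignment.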