
Publication | Open Access

Projections for efficient document clustering

Citations: 214

References: 27

Year: 1997

Abstract

Clustering is increasing in importance, but linear- and even constant-time clustering algorithms are often too slow for real-time applications. A simple way to speed up clustering is to speed up the distance calculations at the heart of clustering routines. We study two techniques for reducing the cost of distance calculations, LSI and truncation, and determine both how much these techniques speed up clustering and how much they affect the quality of the resulting clusters. We find that the speed increase is significant while, surprisingly, the quality of clustering is not adversely affected. We conclude that truncation yields clusters as good as those produced by full-profile clustering while offering a significant speed advantage.
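The truncation idea in the abstract can be illustrated with a small sketch. It assumes that truncation means keeping only the k highest-weighted terms of a sparse term-weight profile and that similarity is measured by cosine; the abstract confirms neither detail. The names `truncate` and `cosine` and the toy weights are hypothetical, chosen only for illustration.

```python
import math


def truncate(profile, k):
    """Keep the k highest-weighted terms of a sparse term-weight profile.

    Assumption: "truncation" is read here as top-k term selection.
    Ties are broken alphabetically so the result is deterministic.
    """
    top = sorted(profile.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
    return dict(top)


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy profiles (made-up weights, not data from the paper).
doc = {"cluster": 3.0, "speed": 2.0, "distance": 1.0, "the": 0.1}
centroid = {"cluster": 2.5, "distance": 1.5, "quality": 1.0, "of": 0.2, "a": 0.1}

full = cosine(doc, centroid)               # full-profile similarity
fast = cosine(doc, truncate(centroid, 3))  # similarity against a truncated profile
```

In this toy case the two similarity values differ only slightly, because the dropped terms carry little weight. The shorter profile means fewer terms to visit per comparison, which is the kind of saving the abstract attributes to cheaper distance calculations.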
