Cluster analysis for hypertext systems

Abstract

Identifying nodes of information that are highly related has many applications in any information system, and in particular in hypertext systems. In this paper we present a technique to identify “natural” clusters in a hypertext. A natural cluster is a cluster that is not arbitrary, but depends only on intrinsic properties of the hypertext. In our case, the property we use to identify the clusters is the number of independent paths between nodes. Using the graph-theoretic definition of k-edge-components, we present an aggregation technique to cluster the nodes. We then use this technique to cluster three medium-sized hypertexts that were developed by different authors for different users, using different methodologies. We also show how to use clustering to improve data display, browsing and retrieval.
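
To make the idea of clustering by independent paths concrete, the following is a minimal sketch and not the aggregation technique of the paper itself. It assumes that hypertext links are treated as undirected edges, that "independent paths" means edge-disjoint paths, and that two nodes belong to the same k-edge-component when at least k such paths connect them. The function names `edge_disjoint_paths` and `k_edge_components` and the sample link list are hypothetical and chosen only for illustration.

```python
from collections import deque


def edge_disjoint_paths(adj, s, t, limit):
    """Count edge-disjoint paths between s and t, stopping once `limit` is reached.

    Uses unit-capacity max flow with breadth-first augmenting paths.
    Assumption: links are undirected, so each link has capacity 1 both ways.
    """
    cap = {(u, v): 1 for u in adj for v in adj[u]}
    found = 0
    while found < limit:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break  # no further augmenting path exists
        # Walk back from t to s, updating residual capacities along the path.
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        found += 1
    return found


def k_edge_components(links, k):
    """Group nodes so that every pair in a group has >= k edge-disjoint paths."""
    adj = {}
    for u, v in links:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    clusters = []
    for node in sorted(adj):
        for cluster in clusters:
            # Having >= k edge-disjoint paths is transitive, so comparing
            # against one representative of each cluster is enough.
            if edge_disjoint_paths(adj, cluster[0], node, k) >= k:
                cluster.append(node)
                break
        else:
            clusters.append([node])
    return clusters


# Hypothetical hypertext: two triangles of nodes joined by a single link c-d.
links = [("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]

print(k_edge_components(links, 1))
print(k_edge_components(links, 2))
```

In this toy example, raising k from 1 to 2 splits the single connected group into the two triangles, because only one link joins them. This mirrors the intuition that nodes joined by more independent paths are more closely related.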
