The TOPHITS Model for Higher-Order Web Link Analysis

Citations: 155

References: 39

Year: 2006

TLDR

As the web grows, analyzing link structure together with context becomes increasingly important, and multilinear algebra offers a way to incorporate anchor text and other information into the authority computation used by link analysis methods such as HITS. TOPHITS applies the PARAFAC model, a higher-order analogue of the matrix singular value decomposition, to a three-way representation of web data, jointly computing hubs, authorities, and the anchor-text terms on the links between them. Because the analysis can be performed in advance and offline, TOPHITS extends the applicability of HITS; like HITS it reveals latent groupings of pages, and it adds latent term information. The paper describes a faster algorithm for computing the TOPHITS model on sparse data, compares HITS and TOPHITS on web data, and discusses query response methods, such as computing context-sensitive authorities and hubs, together with experimental results.

Abstract

As the size of the web increases, it becomes more and more important to analyze link structure while also considering context. Multilinear algebra provides a novel tool for incorporating anchor text and other information into the authority computation used by link analysis methods such as HITS. Our recently proposed TOPHITS method uses a higher-order analogue of the matrix singular value decomposition called the PARAFAC model to analyze a three-way representation of web data. We compute hubs and authorities together with the terms that are used in the anchor text of the links between them. Adding a third dimension to the data greatly extends the applicability of HITS because the TOPHITS analysis can be performed in advance and offline. Like HITS, the TOPHITS model reveals latent groupings of pages, but TOPHITS also includes latent term information. In this paper, we describe a faster mathematical algorithm for computing the TOPHITS model on sparse data, and Web data is used to compare HITS and TOPHITS. We also discuss how the TOPHITS model can be used in queries, such as computing context-sensitive authorities and hubs. We describe different query response methodologies and present experimental results.
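
To make the three-way idea concrete, the following is a minimal sketch of fitting a PARAFAC model to a small hub × authority × term array with alternating least squares (ALS), one standard way to compute such a decomposition. It is an illustration only and not necessarily the algorithm described in the paper: it uses dense NumPy arrays, whereas the paper targets sparse data. The toy link data, the function names `khatri_rao`, `cp_als`, and `authority_scores`, and the scoring rule for term-specific authorities are all assumptions introduced here.

```python
import numpy as np


def khatri_rao(U, V):
    """Column-wise Kronecker product; rows are indexed by (u, v) with v fastest."""
    return (U[:, None, :] * V[None, :, :]).reshape(-1, U.shape[1])


def cp_als(X, rank, n_iter=200, seed=0):
    """Fit X[i, j, k] ~ sum_r H[i, r] * A[j, r] * T[k, r] by alternating least squares.

    H, A, T hold hub, authority, and term loadings (hypothetical names).
    """
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    H, A, T = (rng.random((n, rank)) for n in (I, J, K))
    # Unfold the tensor along each mode once, matching khatri_rao's row order.
    X1 = X.reshape(I, J * K)
    X2 = np.moveaxis(X, 1, 0).reshape(J, I * K)
    X3 = np.moveaxis(X, 2, 0).reshape(K, I * J)
    for _ in range(n_iter):
        # Each update solves a linear least-squares problem for one factor.
        H = X1 @ khatri_rao(A, T) @ np.linalg.pinv((A.T @ A) * (T.T @ T))
        A = X2 @ khatri_rao(H, T) @ np.linalg.pinv((H.T @ H) * (T.T @ T))
        T = X3 @ khatri_rao(H, A) @ np.linalg.pinv((H.T @ H) * (A.T @ A))
    return H, A, T


def authority_scores(H, A, T, term):
    """Assumed scoring rule: modelled in-link weight of each page under one term."""
    return A @ (H.sum(axis=0) * T[term])


# Toy data (made up): X[i, j, k] = 1 if page i links to page j with anchor term k.
X = np.zeros((6, 6, 2))
X[0, 2, 0] = X[1, 2, 0] = 1.0  # pages 0 and 1 link to page 2 using term 0
X[3, 5, 1] = X[4, 5, 1] = 1.0  # pages 3 and 4 link to page 5 using term 1

H, A, T = cp_als(X, rank=2)
X_hat = np.einsum("ir,jr,kr->ijk", H, A, T)
rel_err = np.linalg.norm(X_hat - X) / np.linalg.norm(X)

print("relative fit error:", rel_err)
print("authority scores for term 0:", authority_scores(H, A, T, 0))
print("authority scores for term 1:", authority_scores(H, A, T, 1))
```

Each rank-one component ties a group of hubs to a group of authorities and to the terms used on the links between them, which is the latent term information that plain HITS, working on a page-by-page matrix alone, does not provide. ALS can converge slowly or to a poor local solution on harder data, so the iteration count and random initialization here should be treated as placeholders.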
