Publication | Open Access

Incorporating quality metrics in centralized/distributed information retrieval on the World Wide Web

Citations: 222

References: 7

Year: 2000

Xiaolan Zhu, Susan Gauch

Unknown Venue

TLDR

Most web information retrieval systems rely on similarity ranking based on term frequency, ignoring document quality and thus retrieving low‑quality documents. The study proposes integrating similarity ranking with quality ranking in both centralized and distributed search environments. Six quality metrics—currency, availability, information‑to‑noise ratio, authority, popularity, and cohesiveness—were evaluated for their impact on ranking. Incorporating currency, availability, information‑to‑noise ratio, and cohesiveness improved centralized search, while availability, information‑to‑noise ratio, popularity, and cohesiveness enhanced site selection, and adding popularity to fusion significantly boosted overall effectiveness.

Abstract

Most information retrieval systems on the Internet rely primarily on similarity ranking algorithms based solely on term frequency statistics. Information quality is usually ignored, so documents are retrieved without regard to their quality. We present an approach that combines similarity ranking with quality ranking in centralized and distributed search environments. Six quality metrics were investigated: currency, availability, information-to-noise ratio, authority, popularity, and cohesiveness. Search effectiveness was significantly improved when the currency, availability, information-to-noise ratio, and page cohesiveness metrics were incorporated in centralized search. The improvement seen when the availability, information-to-noise ratio, popularity, and cohesiveness metrics were incorporated in site selection was also significant. Finally, incorporating the popularity metric in information fusion resulted in a significant improvement. In summary, the results show that incorporating quality metrics can generally improve search effectiveness in both centralized and distributed search environments.
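The idea of combining a similarity score with quality metrics can be illustrated with a minimal sketch. The abstract does not give the formula the authors used, so everything below is an assumption made for illustration: the names `Document`, `quality_score`, `combined_score`, and `rank` are hypothetical, all scores are assumed to lie in [0, 1], and the quality score is taken to be a weighted average that multiplies the similarity score.

```python
from dataclasses import dataclass, field

# Metric names follow the abstract; the numeric scale and the weights are
# illustrative assumptions, not values from the paper.
METRICS = (
    "currency",
    "availability",
    "information_to_noise",
    "authority",
    "popularity",
    "cohesiveness",
)


@dataclass
class Document:
    doc_id: str
    similarity: float                             # term-based similarity, assumed in [0, 1]
    quality: dict = field(default_factory=dict)   # metric name -> score in [0, 1]


def quality_score(doc, weights):
    """Weighted average of the selected quality metrics (assumed combination)."""
    total = sum(weights.values())
    if total == 0:
        return 1.0  # no metric selected: similarity is left unchanged
    return sum(w * doc.quality.get(m, 0.0) for m, w in weights.items()) / total


def combined_score(doc, weights):
    """Similarity scaled by quality (a hypothetical choice of combination)."""
    return doc.similarity * quality_score(doc, weights)


def rank(docs, weights):
    """Return documents ordered by combined score, best first."""
    return sorted(docs, key=lambda d: combined_score(d, weights), reverse=True)


if __name__ == "__main__":
    docs = [
        Document("a", 0.8, {"currency": 0.2, "availability": 0.4}),
        Document("b", 0.7, {"currency": 0.9, "availability": 0.9}),
    ]
    weights = {"currency": 1.0, "availability": 1.0}
    for d in rank(docs, weights):
        print(d.doc_id, round(combined_score(d, weights), 3))
```

With these made-up scores, document `b` overtakes `a` despite its lower similarity, because its quality scores are higher. Passing an empty `weights` dictionary falls back to pure similarity ranking, which makes it easy to compare the two orderings.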
