Efficient retrieval of the top-k most relevant spatial web objects

TLDR

The growing geo‑spatial dimension of the Internet, with web documents and points of interest being geo‑tagged, creates a new class of top‑k queries that must balance location proximity and text relevance, yet only naive methods currently exist to address this challenge. This paper introduces a novel indexing framework designed to support location‑aware top‑k text retrieval. The framework combines an inverted file for textual indexing with an R‑tree for spatial proximity, explores multiple indexing strategies, and employs algorithms that use these indexes to prune the search space while jointly considering relevance and distance. Experimental results demonstrate that the framework scales well and delivers excellent performance.

Abstract

The conventional Internet is acquiring a geo-spatial dimension. Web documents are being geo-tagged, and geo-referenced objects such as points of interest are being associated with descriptive text documents. The resulting fusion of geo-location and documents enables a new kind of top-k query that takes into account both location proximity and text relevancy. To our knowledge, only naive techniques exist that are capable of computing a general web information retrieval query while also taking location into account. This paper proposes a new indexing framework for location-aware top-k text retrieval. The framework leverages the inverted file for text retrieval and the R-tree for spatial proximity querying. Several indexing approaches are explored within the framework. The framework encompasses algorithms that utilize the proposed indexes for computing the top-k query, thus taking into account both text relevancy and location proximity to prune the search space. Results of empirical studies with an implementation of the framework demonstrate that the paper's proposal offers scalability and is capable of excellent performance.
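
To make the query type concrete, the following is a minimal sketch of a brute-force location-aware top-k query. It is not the indexing framework the paper proposes: there is no R-tree and no pruning, only a small inverted file to collect candidates, each of which is then scored. The scoring function (a weighted sum of normalized distance and a term-overlap relevance), the weight `alpha`, the cap `max_dist`, and all function names are assumptions made for illustration, since the abstract does not state the ranking function.

```python
import heapq
import math
from collections import defaultdict


def build_inverted_index(objects):
    """Map each term to the set of object ids whose text contains it.

    `objects` is a dict: id -> (x, y, text). This layout is hypothetical.
    """
    index = defaultdict(set)
    for obj_id, (_, _, text) in objects.items():
        for term in text.lower().split():
            index[term].add(obj_id)
    return index


def text_relevance(query_terms, text):
    """Fraction of query terms present in the text (a stand-in for a real
    information-retrieval relevance measure)."""
    terms = set(text.lower().split())
    if not query_terms:
        return 0.0
    return sum(1 for t in query_terms if t in terms) / len(query_terms)


def top_k(objects, index, query_xy, query_text, k, alpha=0.5, max_dist=10.0):
    """Return the k lowest-scoring (score, id) pairs; lower is better.

    Assumed score: alpha * normalized distance + (1 - alpha) * (1 - relevance).
    """
    query_terms = query_text.lower().split()

    # Candidates: every object containing at least one query term.
    candidates = set()
    for term in query_terms:
        candidates |= index.get(term, set())

    scored = []
    for obj_id in candidates:
        x, y, text = objects[obj_id]
        dist = math.hypot(x - query_xy[0], y - query_xy[1])
        spatial = min(dist / max_dist, 1.0)  # clamp to [0, 1]
        textual = 1.0 - text_relevance(query_terms, text)
        scored.append((alpha * spatial + (1 - alpha) * textual, obj_id))

    # Every candidate is scored: this exhaustive work is what an index that
    # prunes on both text and location would aim to avoid.
    return heapq.nsmallest(k, scored)
```

Because every candidate is scored, the cost grows with the number of objects matching any query term, regardless of how far away they are. A combined text and spatial index, as described in the abstract, is intended to discard such objects without examining them individually.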
