Concepedia

Publication | Closed Access

TOPSIG

Citations: 29

References: 12

Year: 2011

Abstract

Comparisons between file signatures and inverted files for text retrieval have shown the shortcomings of traditional file signatures, and it has been widely accepted that traditional file signatures are an inferior alternative to inverted files. This paper describes TopSig, a new approach to the construction of file signatures that extends recent advances in semantic hashing and dimensionality reduction. These advances had not previously been linked to general-purpose, signature-file-based search engines. We demonstrate significant improvements in the performance of signature-file-based indexing and retrieval. Performance is comparable to that of state-of-the-art inverted-file-based systems, including language models and BM25. These findings suggest that file signatures offer a viable alternative to inverted files in suitable settings, and they position the file signatures model in the class of Vector Space retrieval models.
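The abstract does not spell out how TopSig builds its signatures, so the following is only a minimal sketch of the general idea it alludes to: reducing a term-frequency vector to a fixed-width bit string by random projection, then comparing bit strings by Hamming distance. It is not the paper's algorithm. The signature width `SIG_BITS`, the hash-seeded term vectors, the raw term-frequency weighting, and the function names `term_vector`, `signature`, `hamming`, and `rank` are all assumptions made for illustration.

```python
import hashlib
import random
from collections import Counter

SIG_BITS = 256  # assumed signature width; chosen arbitrarily for this sketch


def term_vector(term: str, bits: int = SIG_BITS) -> list[int]:
    """Return a pseudo-random +1/-1 vector for a term, seeded by the term itself."""
    digest = hashlib.sha256(term.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return [1 if rng.random() < 0.5 else -1 for _ in range(bits)]


def signature(text: str, bits: int = SIG_BITS) -> int:
    """Sum frequency-weighted term vectors, then keep one sign bit per dimension."""
    totals = [0] * bits
    for term, count in Counter(text.lower().split()).items():
        for i, value in enumerate(term_vector(term, bits)):
            totals[i] += count * value
    sig = 0
    for i, total in enumerate(totals):
        if total > 0:  # positive sum -> bit 1; zero or negative -> bit 0
            sig |= 1 << i
    return sig


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two signatures differ."""
    return bin(a ^ b).count("1")


def rank(query: str, docs: list[str]) -> list[int]:
    """Return document indices ordered by increasing Hamming distance to the query."""
    q = signature(query)
    sigs = [signature(d) for d in docs]
    return sorted(range(len(docs)), key=lambda i: hamming(q, sigs[i]))


if __name__ == "__main__":
    docs = [
        "signature files for text retrieval",
        "cooking pasta tomato sauce",
    ]
    print(rank("signature files text retrieval", docs))
```

Because each signature is a dense bit string derived from a projected term vector, documents that share many terms tend to have signatures that differ in few bits, which is one way to read the abstract's remark that the model belongs to the Vector Space family.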
