Publication | Open Access
Scalable and Effective Generative Information Retrieval
Citations: 19
References: 20
Year: 2024
Unknown Venue
Engineering, Machine Learning, Intelligent Information Retrieval, Query Model, Generative Retrieval, Text Mining, Natural Language Processing, Information Retrieval, Data Science, Relevance Feedback, Document Id, Query Expansion, Machine Translation, Transformer Networks, Knowledge Discovery, Computer Science, Deep Learning, Retrieval Augmented Generation, Interactive Information Retrieval
Recent research has shown that transformer networks can be used as differentiable search indexes by representing each document as a sequence of document ID tokens. These generative retrieval models cast the retrieval problem as a document ID generation problem for each query. Despite their elegant design, existing generative retrieval models only perform well on small-scale, artificially constructed collections. This paper represents an important milestone in generative retrieval research by showing that generative retrieval models can be trained to perform effectively on large-scale standard retrieval benchmarks. In more detail, we propose RIPOR, an optimization framework for generative retrieval that is designed based on two often-overlooked fundamental design considerations. First, RIPOR introduces a novel prefix-oriented ranking optimization algorithm for accurate estimation of relevance scores during sequential document ID generation. Second, RIPOR constructs document IDs based on the relevance associations between queries and documents. Evaluation on MS MARCO and the TREC Deep Learning Track reveals that RIPOR surpasses state-of-the-art generative retrieval models by a large margin (e.g., a 30.5% MRR improvement on the MS MARCO Dev set).
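To make the idea of sequential document ID generation concrete, the following is a minimal sketch of beam search constrained to valid document ID prefixes. It is not RIPOR's algorithm or training objective. It is a generic illustration, under toy assumptions, of why accurate relevance estimates for prefixes matter: a document whose prefix is pruned early can never be retrieved. The names `beam_search`, `build_prefix_set`, `toy_step_score`, `DOCS`, and `TOY_SCORES` are hypothetical, and the score table stands in for a trained model's per-step scores.

```python
from typing import Callable, List, Sequence, Set, Tuple

DocId = Tuple[int, ...]

def build_prefix_set(doc_ids: Sequence[DocId]) -> Set[DocId]:
    """Collect every valid prefix so decoding never leaves the ID space."""
    prefixes: Set[DocId] = set()
    for doc_id in doc_ids:
        for i in range(1, len(doc_id) + 1):
            prefixes.add(doc_id[:i])
    return prefixes

def beam_search(
    doc_ids: Sequence[DocId],
    step_score: Callable[[DocId, int], float],
    vocab_size: int,
    beam_size: int,
) -> List[Tuple[DocId, float]]:
    """Generate document IDs token by token, keeping the top prefixes.

    Assumes all document IDs have the same length. The score of a prefix
    is the sum of its per-step scores.
    """
    valid = build_prefix_set(doc_ids)
    id_len = len(doc_ids[0])
    beams: List[Tuple[DocId, float]] = [((), 0.0)]
    for _ in range(id_len):
        candidates = []
        for prefix, score in beams:
            for token in range(vocab_size):
                extended = prefix + (token,)
                if extended in valid:
                    candidates.append((extended, score + step_score(prefix, token)))
        # Pruning happens here: prefixes outside the top-k are lost for good.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    return beams

# Toy collection of three documents with two-token IDs (assumed values).
DOCS: List[DocId] = [(0, 0), (0, 1), (1, 0)]

# Hypothetical per-step scores for one query. Document (0, 1) has the best
# total score, but its first token scores lower than the alternative.
TOY_SCORES = {
    ((), 0): 0.1,
    ((), 1): 0.5,
    ((0,), 0): 0.2,
    ((0,), 1): 2.0,
    ((1,), 0): 0.3,
}

def toy_step_score(prefix: DocId, token: int) -> float:
    return TOY_SCORES[(prefix, token)]

narrow = beam_search(DOCS, toy_step_score, vocab_size=2, beam_size=1)
wide = beam_search(DOCS, toy_step_score, vocab_size=2, beam_size=2)
print("beam_size=1 top document:", narrow[0][0])
print("beam_size=2 top document:", wide[0][0])
```

In this toy setup the narrow beam discards the prefix `(0,)` after the first step and therefore misses the highest-scoring document, while the wider beam recovers it. A ranking objective that accounts for prefix scores, as the abstract describes, targets this kind of failure.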