Concepedia

Publication | Closed Access

Toward Entity Retrieval over Structured and Text Data

Citations: 16

References: 29

Year: 2004

Abstract

Many real-world applications increasingly involve both structured data and text. Hence, managing both in an efficient and integrated manner has received much attention from both the IR and database communities. To date, however, little research has been devoted to semantic issues in the integration of text and data. In this paper, we introduce a problem in this realm: entity retrieval. Given data fragments that describe various aspects of a real-world entity, the task is to find all other data fragments, as well as text documents, that describe that same entity. As such, entity retrieval is a novel retrieval problem that differs from both regular text retrieval and database search in that it explicitly requires matching information at the semantic level; matching syntactically, as done in current search engines and relational databases, would be inherently suboptimal. We define entity retrieval and conduct a case study of retrieving information about a researcher from both the Web and a bibliographic database (DBLP). We propose several methods for exploiting the structured information in the database to improve entity retrieval over the text collection. Specifically, we present a query expansion mechanism based on information extracted from structured data. Experimental results show that selectively using more structured information to expand the text query improves entity retrieval performance on text. We conclude the paper with future research directions for entity retrieval.
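The abstract does not say how the query expansion is carried out, so the following is only a minimal sketch of the general idea: start from a researcher's name and add a few terms drawn from structured bibliographic records. The record layout, the field names (`coauthors`, `venues`), the function `expand_query`, and the frequency-based selection are all assumptions made for illustration. They are not the paper's method and not the DBLP schema.

```python
from collections import Counter


def expand_query(name, records, fields=("coauthors", "venues"), top_k=2):
    """Build an expanded text query from structured records about one entity.

    Hypothetical helper: `records` is a list of dicts whose field values
    are lists of strings. The most frequent values of each chosen field
    are appended to the quoted entity name.
    """
    terms = [f'"{name}"']
    for field in fields:
        # Count how often each value occurs across all records.
        counts = Counter(
            value for record in records for value in record.get(field, [])
        )
        # Keep only the top_k most frequent values (selective expansion).
        for value, _ in counts.most_common(top_k):
            terms.append(f'"{value}"')
    return " ".join(terms)


# Made-up records for a made-up researcher, for illustration only.
records = [
    {"title": "Paper A", "coauthors": ["Alice Example"], "venues": ["SIGIR"]},
    {"title": "Paper B", "coauthors": ["Alice Example", "Bob Sample"], "venues": ["CIKM"]},
    {"title": "Paper C", "coauthors": ["Bob Sample", "Alice Example"], "venues": ["SIGIR"]},
]

query = expand_query("Jane Doe", records, top_k=1)
name_only = expand_query("Jane Doe", records, fields=())
```

Here `fields` and `top_k` control how much structured information enters the query, loosely mirroring the abstract's point that expansion should be selective. The resulting string would then be passed to whatever text search engine is in use.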
