Publication | Closed Access
Question answering on interlinked data
Citations: 50
References: 25
Year: 2013
Unknown Venue
Engineering, Semantic Search, Semantic Web, Text Mining, Natural Language Processing, Benchmark Queries, User Supplied Queries, Information Retrieval, Data Science, Computational Linguistics, Management, Data Integration, Linked Data, Question Answering, Entity Disambiguation, Natural Language Interface, Knowledge Discovery, Query Optimization, Formal Queries
The Data Web contains a wealth of knowledge on a large number of domains. Question answering over interlinked data sources is challenging due to two inherent characteristics. First, different datasets employ heterogeneous schemas, and each one may contain only part of the answer to a given question. Second, constructing a federated formal query across different datasets requires exploiting the links between those datasets on both the schema and instance levels. We present a question answering system that transforms user-supplied queries (i.e., natural language sentences or keywords) into conjunctive SPARQL queries over a set of interlinked data sources. The contribution of this paper is two-fold. First, we introduce a novel approach for determining the most suitable resources for a user-supplied query from different datasets (disambiguation). We employ a hidden Markov model whose parameters were bootstrapped with different distribution functions. Second, we present a novel method for constructing federated formal queries using the disambiguated resources and leveraging the linking structure of the underlying datasets. This approach relies on a combination of domain and range inference and a link traversal method to construct a connected graph, which is then rendered as a corresponding SPARQL query. The results of our evaluation with three life-science datasets and 25 benchmark queries demonstrate the effectiveness of our approach.
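To make the disambiguation step more concrete, the following is a minimal sketch of how a hidden Markov model could pick one resource per keyword. It assumes Viterbi decoding, which the abstract does not name, and every resource identifier (the `ex:` names), keyword, and probability below is hypothetical and hand-picked for illustration; the paper bootstraps its parameters with distribution functions that are not reproduced here.

```python
from math import log

EPS = 1e-9  # floor for unseen probabilities, so log() is always defined


def _p(table, a, b):
    """Look up table[a][b], falling back to a small floor value."""
    return max(table.get(a, {}).get(b, 0.0), EPS)


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden state sequence for the observations."""
    # Each column maps state -> (best log-probability, predecessor state).
    first = observations[0]
    best = [{
        s: (log(max(start_p.get(s, 0.0), EPS)) + log(_p(emit_p, s, first)), None)
        for s in states
    }]
    for obs in observations[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            # Pick the predecessor that maximizes the path score into s.
            score, back = max(
                (prev[p][0] + log(_p(trans_p, p, s)), p) for p in states
            )
            column[s] = (score + log(_p(emit_p, s, obs)), back)
        best.append(column)

    # Backtrack from the best final state.
    path = [max(states, key=lambda s: best[-1][s][0])]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path


# Hypothetical candidate resources (hidden states) for two keywords.
STATES = [
    "ex:drug/Aspirin", "ex:band/Aspirin",
    "ex:prop/sideEffect", "ex:film/SideEffects",
]
START = {s: 0.25 for s in STATES}
# Emission: how well a resource label matches a keyword (made-up values).
EMIT = {
    "ex:drug/Aspirin": {"aspirin": 1.0},
    "ex:band/Aspirin": {"aspirin": 1.0},
    "ex:prop/sideEffect": {"side effects": 1.0},
    "ex:film/SideEffects": {"side effects": 1.0},
}
# Transition: how strongly two resources are connected (made-up values).
TRANS = {
    "ex:drug/Aspirin": {"ex:prop/sideEffect": 0.7, "ex:film/SideEffects": 0.1},
    "ex:band/Aspirin": {"ex:prop/sideEffect": 0.1, "ex:film/SideEffects": 0.4},
}

KEYWORDS = ["aspirin", "side effects"]
RESULT = viterbi(KEYWORDS, STATES, START, TRANS, EMIT)
print(RESULT)
```

In this toy setting the drug resource and the side-effect property are chosen together because the transition between them is the strongest, which mirrors the idea that related resources should reinforce each other during disambiguation.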