Publication | Closed Access
Concept Expansion Using Web Tables
Citations: 50
References: 30
Year: 2015
Unknown Venue
Ranking Algorithm, Engineering, Knowledge Extraction, Learning To Rank, Semantics, Semantic Web, Text Mining, Natural Language Processing, Information Retrieval, Data Science, Data Mining, Query Expansion, Semantic Drift, Entity Disambiguation, Knowledge Discovery, Formal Concept Analysis, Following Problem, Seed Entities, Web Intelligence
We study the following problem: given the name of an ad-hoc concept as well as a few seed entities belonging to the concept, output all entities belonging to it. Since producing the exact set of entities is hard, we focus on returning a ranked list of entities. Previous approaches either use seed entities as the only input, or inherently require negative examples. They suffer from input ambiguity and semantic drift, or are not viable options for ad-hoc tail concepts. In this paper, we propose to leverage the millions of tables on the web for this problem. The core technical challenge is to identify the "exclusive" tables for a concept to prevent semantic drift; existing holistic ranking techniques like personalized PageRank are inadequate for this purpose. We develop novel probabilistic ranking methods that can model a new type of table-entity relationship. Experiments with real-life concepts show that our proposed solution is significantly more effective than applying state-of-the-art set expansion or holistic ranking techniques.
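To make the baseline mentioned in the abstract concrete, the following is a minimal sketch of personalized PageRank over a table-entity bipartite graph. It is not the probabilistic ranking method proposed in the paper, which the abstract does not specify. The function name, the damping value, and the toy tables are all illustrative assumptions.

```python
from collections import defaultdict


def personalized_pagerank(tables, seeds, damping=0.85, iterations=50):
    """Rank non-seed entities by a random walk with restart at the seeds.

    `tables` maps a (hypothetical) table id to the entities it lists.
    This is the holistic baseline named in the abstract, not the paper's method.
    """
    # Build an undirected bipartite graph: table nodes <-> entity nodes.
    neighbors = defaultdict(set)
    for table_id, entities in tables.items():
        for entity in entities:
            neighbors[("table", table_id)].add(("entity", entity))
            neighbors[("entity", entity)].add(("table", table_id))

    seed_nodes = [("entity", s) for s in seeds if ("entity", s) in neighbors]
    if not seed_nodes:
        raise ValueError("no seed entity occurs in any table")
    # The walk restarts uniformly at the seed entities.
    restart = {node: 1.0 / len(seed_nodes) for node in seed_nodes}

    scores = dict(restart)
    for _ in range(iterations):
        updated = defaultdict(float)
        # Each node passes a damped, equal share of its mass to every neighbor.
        for node, mass in scores.items():
            share = damping * mass / len(neighbors[node])
            for nxt in neighbors[node]:
                updated[nxt] += share
        # The remaining mass jumps back to the seeds.
        for node, mass in restart.items():
            updated[node] += (1.0 - damping) * mass
        scores = dict(updated)

    ranked = [
        (node[1], score)
        for node, score in scores.items()
        if node[0] == "entity" and node[1] not in seeds
    ]
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


# Toy data (invented for illustration): t1 and t2 list database systems,
# while t3 lists companies and shares only the ambiguous seed "oracle".
tables = {
    "t1": ["oracle", "mysql", "postgresql", "sqlite"],
    "t2": ["oracle", "mysql", "postgresql"],
    "t3": ["oracle", "microsoft", "google", "ibm"],
}
ranking = personalized_pagerank(tables, seeds=["oracle", "mysql"])
for entity, score in ranking:
    print(f"{entity}: {score:.4f}")
```

In this toy input, `postgresql` and `sqlite` rank first, but the company names from `t3` still receive nonzero scores because that table shares one seed. That leakage is a small-scale picture of the semantic drift the abstract describes, and it suggests why scoring how exclusive a table is to the concept could matter.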