Publication | Closed Access
Computing Semantic Similarity of Concepts in Knowledge Graphs
Citations: 229
References: 45
Year: 2016
Engineering, Semantics, Semantic Web, Corpus Linguistics, Text Mining, Natural Language Processing, Information Retrieval, Data Science, Language Studies, Knowledge Representation, Knowledge Discovery, Terminology Extraction, Computer Science, Knowledge Graphs, Semantic Similarity Method, Semantic Similarity Methods, Semantic Network, Semantic Graph, Linguistics, Semantic Similarity
Previous semantic similarity methods have focused on either network structure or information content (IC), and conventional corpus-based IC requires a costly annotated domain corpus. The authors propose wpath, a method that uses IC to weight the shortest path length between concepts in Knowledge Graphs (KGs), with IC computed from the distribution of concepts over KG instances. On well-known word-similarity datasets, wpath yields a statistically significant improvement over other semantic similarity methods, and in a category-classification evaluation it achieves the best accuracy and F-score.
This paper presents a method for measuring the semantic similarity between concepts in Knowledge Graphs (KGs) such as WordNet and DBpedia. Previous work on semantic similarity has focused either on the structure of the semantic network between concepts (e.g., path length and depth) or only on the Information Content (IC) of concepts. We propose a semantic similarity method, namely wpath, that combines these two approaches by using IC to weight the shortest path length between concepts. Conventional corpus-based IC is computed from the distribution of concepts over a textual corpus, which requires preparing a domain corpus containing annotated concepts and has a high computational cost. Because instances in KGs have already been extracted from textual corpora and annotated with concepts, we propose graph-based IC, which computes IC from the distribution of concepts over instances. Through experiments on well-known word similarity datasets, we show that the wpath method produces a statistically significant improvement over other semantic similarity methods. Moreover, in a real category classification evaluation, the wpath method shows the best performance in terms of accuracy and F-score.
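The idea of weighting shortest path length by IC can be illustrated with a minimal sketch. The abstract does not state the formula, so the code assumes one plausible form, sim = 1 / (1 + length * k^IC(lcs)), where lcs is the least common subsumer of the two concepts and k in (0, 1] is a tuning parameter. It also assumes graph-based IC is the negative log of the share of instances annotated with a concept or any of its descendants. The toy taxonomy, the instance counts, and all function names are hypothetical and are not taken from the paper, WordNet, or DBpedia.

```python
import math

# Hypothetical toy taxonomy: child -> parent ("entity" is the root).
PARENT = {
    "animal": "entity",
    "plant": "entity",
    "dog": "animal",
    "cat": "animal",
    "tree": "plant",
}

# Hypothetical number of KG instances directly annotated with each concept.
DIRECT_INSTANCES = {
    "entity": 0, "animal": 2, "plant": 1,
    "dog": 40, "cat": 30, "tree": 27,
}


def ancestors(concept):
    """Return the chain [concept, parent, ..., root]."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def graph_ic(concept):
    """Assumed graph-based IC: -log of the fraction of instances that are
    annotated with the concept itself or with one of its descendants."""
    total = sum(DIRECT_INSTANCES.values())
    freq = sum(n for c, n in DIRECT_INSTANCES.items() if concept in ancestors(c))
    return -math.log(freq / total)


def lcs_and_path_length(c1, c2):
    """Return (least common subsumer, shortest path length in edges).
    In a tree, the first shared ancestor is the LCS."""
    a1, a2 = ancestors(c1), ancestors(c2)
    for i, node in enumerate(a1):
        if node in a2:
            return node, i + a2.index(node)
    raise ValueError("concepts share no ancestor")


def wpath(c1, c2, k=0.8):
    """Assumed wpath form: path length weighted by k raised to IC(lcs)."""
    lcs, length = lcs_and_path_length(c1, c2)
    return 1.0 / (1.0 + length * k ** graph_ic(lcs))


if __name__ == "__main__":
    for pair in [("dog", "cat"), ("dog", "tree"), ("dog", "dog")]:
        print(pair, round(wpath(*pair), 4))
```

In this sketch, a more specific (higher-IC) common subsumer shrinks the effective path length, so dog and cat score higher than dog and tree. When the common subsumer is the root, IC is zero and the score reduces to a plain path-based similarity.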