Publication | Closed Access
Lexical Semantic Relatedness with Random Graph Walks
Year: 2007
Citations: 183
References: 24
Engineering, Generalized PageRank Algorithm, Semantics, Semantic Web, Semantic Similarity, Corpus Linguistics, Text Mining, Word Embeddings, Applied Linguistics, Natural Language Processing, Information Retrieval, Data Science, Computational Linguistics, Language Studies, Thesaurus Graph, Markov Chain, Computational Lexicology, Knowledge Discovery, Distributional Semantics, Vector Space Model, Lexical Semantic Relatedness, Linguistics, Word-sense Disambiguation
Many systems for tasks such as question answering, multi-document summarization, and information retrieval need robust numerical measures of lexical relatedness. Standard thesaurus-based measures of word pair similarity are based on only a single path between those words in the thesaurus graph. By contrast, we propose a new model of lexical semantic relatedness that incorporates information from every explicit or implicit path connecting the two words in the entire graph. Our model uses a random walk over nodes and edges derived from WordNet links and corpus statistics. We treat the graph as a Markov chain and compute a word-specific stationary distribution via a generalized PageRank algorithm. Semantic relatedness of a word pair is scored by a novel divergence measure, ZKL, that outperforms existing measures on certain classes of distributions. In our experiments, the resulting relatedness measure is the WordNet-based measure most highly correlated with human similarity judgments by rank ordering at ρ = .90.
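The pipeline in the abstract (a random walk with restarts at one word, a word-specific stationary distribution, then a divergence between two such distributions) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: the toy graph in `EDGES` is hand-made rather than derived from WordNet or corpus statistics, the `restart` probability of 0.15 is an arbitrary choice, and the function names are hypothetical. The abstract does not define ZKL, so the sketch substitutes Jensen-Shannon divergence as a stand-in; it is not the paper's measure.

```python
import math

# Hypothetical toy graph (illustrative only, not real WordNet links).
EDGES = [
    ("car", "automobile"), ("car", "vehicle"), ("automobile", "vehicle"),
    ("vehicle", "bicycle"), ("vehicle", "object"), ("object", "food"),
    ("food", "fruit"), ("fruit", "banana"), ("fruit", "apple"),
]
NODES = sorted({word for edge in EDGES for word in edge})
NEIGHBORS = {node: set() for node in NODES}
for a, b in EDGES:
    NEIGHBORS[a].add(b)
    NEIGHBORS[b].add(a)


def word_distribution(seed, restart=0.15, iterations=100):
    """Power iteration for a random walk that teleports back to `seed`.

    This is personalized PageRank on the toy graph; `restart` is an
    assumed value, not one taken from the paper.
    """
    p = {node: 1.0 if node == seed else 0.0 for node in NODES}
    for _ in range(iterations):
        # Restart mass goes to the seed; the rest follows outgoing edges.
        nxt = {node: restart if node == seed else 0.0 for node in NODES}
        for node, mass in p.items():
            share = (1.0 - restart) * mass / len(NEIGHBORS[node])
            for neighbor in NEIGHBORS[node]:
                nxt[neighbor] += share
        p = nxt
    return p


def jensen_shannon(p, q):
    """Jensen-Shannon divergence (natural log); a stand-in for ZKL."""
    def kl(a, b):
        return sum(a[n] * math.log(a[n] / b[n]) for n in NODES if a[n] > 0.0)

    m = {n: 0.5 * (p[n] + q[n]) for n in NODES}
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def divergence(word_a, word_b):
    """Lower values mean the two words' walk distributions are more alike."""
    return jensen_shannon(word_distribution(word_a), word_distribution(word_b))


if __name__ == "__main__":
    for pair in [("car", "automobile"), ("car", "banana")]:
        print(pair, round(divergence(*pair), 4))
```

Because the walk spreads probability mass along every path out of the seed word, two words that share many short paths end up with similar distributions and therefore a low divergence, which is the intuition behind scoring relatedness this way.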