Publication | Closed Access
An evaluation of corpus-driven measures of medical concept similarity for information retrieval
Citations: 42
References: 11
Year: 2012
Unknown Venue
Engineering, Semantic Search, Intelligent Information Retrieval, Similarity Measure, Semantic Web, Semantics, Corpus Linguistics, Text Mining, Natural Language Processing, Information Retrieval, Data Science, Data Mining, Medical Information Retrieval, Computational Linguistics, Corpus-driven Measures, Language Studies, Biomedical Text Mining, Semantic Similarity Measures, Knowledge Retrieval, Knowledge Discovery, Medical Concept Similarity, Terminology Extraction, Distributional Semantics, Linguistics, Health Informatics, Semantic Similarity
Measures of semantic similarity between medical concepts are central to a number of techniques in medical informatics, including query expansion in medical information retrieval. Previous work has mainly considered thesaurus-based path measures of semantic similarity and has not compared different corpus-driven approaches in depth. We evaluate the effectiveness of eight common corpus-driven measures in capturing semantic relatedness by comparing their scores against similarity judgements that medical professionals assigned to concept pairs. Our results show that certain corpus-driven measures correlate strongly (approximately 0.8) with human judgements. An important finding is that performance was significantly affected by the choice of corpus used to prime the measure, i.e., the corpus used as evidence from which corpus-driven similarities are drawn. This paper provides guidelines for the implementation of semantic similarity measures for medical informatics and concludes with implications for medical information retrieval.
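The evaluation described in the abstract, scoring concept pairs with a corpus-driven measure and correlating those scores with human judgements, can be illustrated with a minimal sketch. The abstract does not name the eight measures or the correlation coefficient, so cosine similarity over co-occurrence vectors and Pearson correlation are assumptions here, and the concept vectors and ratings are invented toy values, not data from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical co-occurrence counts: one row per concept, one column per
# context term. In practice these would be derived from the priming corpus.
concept_vectors = {
    "renal failure": [4, 0, 1, 3],
    "kidney failure": [3, 0, 1, 4],
    "diabetes": [1, 4, 0, 1],
    "metformin": [0, 5, 1, 1],
    "fracture": [0, 0, 5, 0],
}

# Hypothetical human similarity ratings for concept pairs (higher = more similar).
human_ratings = {
    ("renal failure", "kidney failure"): 4.0,
    ("diabetes", "metformin"): 3.0,
    ("renal failure", "diabetes"): 2.0,
    ("kidney failure", "fracture"): 1.0,
}


def evaluate(vectors, ratings):
    """Correlate the measure's pair scores with the human ratings."""
    pairs = list(ratings)
    predicted = [cosine(vectors[a], vectors[b]) for a, b in pairs]
    gold = [ratings[pair] for pair in pairs]
    return pearson(predicted, gold)


score = evaluate(concept_vectors, human_ratings)
print(f"correlation with human ratings: {score:.2f}")
```

Rebuilding `concept_vectors` from a different corpus and calling `evaluate` again is one way to observe the corpus sensitivity the abstract reports, since the measure itself stays fixed while its evidence changes.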