Publication | Closed Access
An information-theoretic measure for document similarity
Citations: 71
References: 2
Year: 2003
Venue: Unknown
Engineering, Similarity Measure, Semantic Web, Corpus Linguistics, Text Mining, Natural Language Processing, Information Retrieval, Data Science, Data Mining, Computational Linguistics, Language Studies, Document Similarity, Document Clustering, Information Theory, Similarity Search, Knowledge Discovery, Computer Science, Pairwise Document Similarity, Pairwise Object Similarity, Vector Space Model, Linguistics, Semantic Similarity
Recent work has demonstrated that the assessment of pairwise object similarity can be approached in an axiomatic manner using information theory. We extend this concept specifically to document similarity and test the effectiveness of an information-theoretic measure for pairwise document similarity. We adapt query retrieval to rate the quality of document similarity measures and demonstrate that our proposed information-theoretic measure for document similarity yields statistically significant improvements over other popular measures of similarity.
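The abstract does not state the measure itself, so the following is only a minimal sketch of what an information-theoretic pairwise document similarity could look like, not necessarily the formula the paper proposes. It assumes a Lin-style ratio: the information carried by the terms two documents share, divided by the information needed to describe both documents. The information of a term is taken to be the negative log of the fraction of corpus documents containing it. The function names (`term_probabilities`, `document_frequencies`, `it_similarity`) and the toy corpus are hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter


def term_probabilities(tokens):
    """Relative frequency of each term within a single document."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def document_frequencies(corpus):
    """Fraction of corpus documents that contain each term."""
    n_docs = len(corpus)
    df = Counter()
    for doc in corpus:
        df.update(set(doc))  # count each term at most once per document
    return {term: count / n_docs for term, count in df.items()}


def it_similarity(doc_a, doc_b, pi):
    """Assumed Lin-style similarity: shared information over total information.

    `pi` maps each term to the fraction of documents containing it, so
    -log(pi[term]) is that term's information content. Both documents
    are assumed to be drawn from the corpus used to build `pi`.
    """
    pa = term_probabilities(doc_a)
    pb = term_probabilities(doc_b)

    def info(term):
        return -math.log(pi[term])

    # Information in the overlap: the smaller weight of each shared term.
    shared = sum(min(pa[t], pb[t]) * info(t) for t in pa.keys() & pb.keys())
    # Information needed to describe each document in full.
    total = sum(p * info(t) for t, p in pa.items())
    total += sum(p * info(t) for t, p in pb.items())
    return 2 * shared / total if total > 0 else 0.0


if __name__ == "__main__":
    # Toy corpus of pre-tokenized documents (illustrative data only).
    corpus = [
        ["information", "theory", "similarity"],
        ["document", "similarity", "retrieval"],
        ["query", "retrieval", "ranking"],
    ]
    pi = document_frequencies(corpus)
    print(it_similarity(corpus[0], corpus[1], pi))
```

Under these assumptions the score is 1 for identical documents and 0 for documents with no terms in common, and rare shared terms contribute more than common ones. A term that occurs in every document carries zero information and therefore does not affect the score.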