Publication | Closed Access

A modified approach to keyword extraction based on word-similarity

Citations: 11

References: 11

Year: 2009

Abstract

Two approaches to keyword extraction are commonly used. The first relies only on information about individual words, such as word frequency and TF.IDF; the second relies on the relationships between words. These relationships are usually described by word similarity, which is derived from a corpus (WordNet, HowNet) or a man-made thesaurus. With today's information explosion, the words in use are growing in number and changing rapidly, and many new words are not covered by man-made corpora. This paper proposes a new method for building a word-similarity thesaurus. Using the semantic information from this thesaurus, together with TF.IDF and each word's first occurrence, a keyword extraction algorithm is presented, along with results and analysis.
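
The abstract names three signals (TF.IDF, first occurrence, and thesaurus-based word similarity) but does not say how they are combined. The following is only a minimal sketch of one way such a combination might look, not the paper's algorithm. The function names, the smoothed IDF, the position score, the `similarity` lookup table, and the weights `alpha`, `beta`, and `gamma` are all assumptions introduced for illustration.

```python
import math
from collections import Counter


def tf_idf(doc_tokens, corpus):
    """Return {word: tf * idf} for one tokenized document.

    `corpus` is a list of tokenized documents. The smoothed IDF used here
    is an assumption, not a formula taken from the paper.
    """
    counts = Counter(doc_tokens)
    n_docs = len(corpus)
    scores = {}
    for word, count in counts.items():
        tf = count / len(doc_tokens)
        df = sum(1 for d in corpus if word in d)  # documents containing word
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        scores[word] = tf * idf
    return scores


def first_occurrence(doc_tokens):
    """Return {word: score}, where earlier first appearance scores higher."""
    n = len(doc_tokens)
    scores = {}
    for i, word in enumerate(doc_tokens):
        if word not in scores:
            scores[word] = 1.0 - i / n  # 1.0 for the very first token
    return scores


def extract_keywords(doc_tokens, corpus, similarity, top_k=3,
                     alpha=0.5, beta=0.3, gamma=0.2):
    """Rank words by a weighted sum of the three signals (hypothetical weights).

    `similarity` maps frozenset({w1, w2}) to a value in [0, 1]; it stands in
    for the word-similarity thesaurus, whose construction is not shown here.
    """
    if not doc_tokens:
        return []
    tfidf = tf_idf(doc_tokens, corpus)
    position = first_occurrence(doc_tokens)
    max_tfidf = max(tfidf.values())
    vocab = list(tfidf)
    combined = {}
    for word in vocab:
        # Semantic support: mean similarity to the other words in the document.
        others = [similarity.get(frozenset((word, o)), 0.0)
                  for o in vocab if o != word]
        semantic = sum(others) / len(others) if others else 0.0
        combined[word] = (alpha * tfidf[word] / max_tfidf
                          + beta * position[word]
                          + gamma * semantic)
    # Sort by descending score, breaking ties alphabetically.
    return sorted(combined, key=lambda w: (-combined[w], w))[:top_k]


doc = ["keyword", "extraction", "uses", "keyword", "similarity"]
corpus = [doc, ["uses", "a", "thesaurus"], ["word", "frequency", "uses"]]
similarity = {frozenset(("keyword", "extraction")): 0.8}  # toy thesaurus entry
top_two = extract_keywords(doc, corpus, similarity, top_k=2)
```

In this toy run, a word that is frequent in the document, rare in the corpus, appears early, and is similar to other words in the document ranks highest. A linear combination is used only because it is the simplest option to read; the actual weighting in the paper may differ.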
