Concepedia

Publication | Closed Access

Design & development of rule based inflectional and derivational Urdu stemmer ‘Usal’

Citations: 18

References: 9

Year: 2015

Abstract

Urdu is a morphologically rich language, which means that Urdu words occur in many variant forms. Morphology, the study of word structure, plays an important role in Natural Language Processing. In this paper, we focus on the Urdu language and develop a rule-based inflectional and derivational Urdu stemmer. Stemming is a branch of morphology: in general, it is the process of extracting the 'root' word from its surface form by separating the affixes. A simple rule-based stemming algorithm gives rise to the problems of under-stemming and over-stemming. To reduce under-stemming, we use a longest-suffix-stripping algorithm, and to reduce over-stemming, we have created a database of exception words and stop-words.
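The two remedies described in the abstract can be illustrated with a minimal sketch. It is not the Usal implementation: the suffix list, the exception words, the stop-words, and the function name `stem` are all hypothetical, chosen only to exercise the logic. It also assumes that stop-words and exception words are returned unchanged, which the abstract does not specify.

```python
# Toy data only: hypothetical entries, NOT the rule set or word lists of the paper.
SUFFIXES = ("یں", "وں", "ں")      # candidate suffixes of different lengths
EXCEPTIONS = {"گیہوں"}            # words that merely look suffixed (guards against over-stemming)
STOP_WORDS = {"میں", "نہیں"}      # function words left untouched (an assumption of this sketch)


def stem(word: str) -> str:
    """Strip the longest matching suffix unless the word is protected."""
    # Protected words are returned as-is so that no rule can over-stem them.
    if word in STOP_WORDS or word in EXCEPTIONS:
        return word
    # Try longer suffixes first, so a short suffix cannot pre-empt a longer
    # one and leave part of the affix attached (under-stemming).
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) > len(suffix):
            return word[: -len(suffix)]
    # No rule applies: the word is treated as its own stem.
    return word
```

With this toy data, `stem("کتاب" + "یں")` should strip the two-character suffix rather than the one-character one, while every entry in `EXCEPTIONS` and `STOP_WORDS` is returned unchanged. A real stemmer would also need prefix rules and far larger word lists.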
