Publication | Closed Access
Stop-word removal algorithm for Arabic language
Citations: 44
References: 0
Year: 2004
Unknown Venue
Natural Language Processing, Engineering, Information Retrieval, Arabic, Text Processing, Text Segmentation, Computational Linguistics, String Processing, Keyword Extraction, Stemming, Finite State Machine, Language Studies, Arabic Language, Information Extraction, Linguistics, Text Mining, Spelling Normalization
Summary form only given. We have designed and implemented an efficient stop-word removal algorithm for the Arabic language based on a finite state machine (FSM). An efficient stop-word removal technique is needed in many natural language processing applications, such as spelling normalization, stemming and stem weighting, question answering systems, and information retrieval (IR) systems. Most existing stop-word removal techniques are based on a dictionary that contains a list of stop-words. This approach is expensive: searching the dictionary takes too much time, and storing the stop-words requires too much space. The new Arabic stop-word removal technique has been tested on a set of 242 Arabic abstracts chosen from the Proceedings of the Saudi Arabian National Computer Conferences and on another data set chosen from the Holy Qur'an, and it gives impressive results, reaching approximately 98%.
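The abstract does not describe how the FSM is constructed, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes the machine is a deterministic, character-level automaton (effectively a trie) built from a small, hypothetical sample of Arabic stop-words. The names `SAMPLE_STOP_WORDS`, `build_fsm`, `is_stop_word`, and `remove_stop_words` are illustrative and do not come from the paper.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical sample only; the paper's actual stop-word inventory is not given.
SAMPLE_STOP_WORDS = ["في", "من", "على", "إلى", "عن", "هذا", "هذه", "التي", "الذي"]

Fsm = Tuple[List[Dict[str, int]], Set[int]]


def build_fsm(words: Iterable[str]) -> Fsm:
    """Build a deterministic character-level FSM (a trie) from the given words.

    Returns one transition table per state plus the set of accepting states.
    State 0 is the start state.
    """
    transitions: List[Dict[str, int]] = [{}]
    accepting: Set[int] = set()
    for word in words:
        state = 0
        for ch in word:
            if ch not in transitions[state]:
                transitions.append({})  # allocate a new state
                transitions[state][ch] = len(transitions) - 1
            state = transitions[state][ch]
        accepting.add(state)  # the state reached at the end of a word accepts
    return transitions, accepting


def is_stop_word(token: str, fsm: Fsm) -> bool:
    """Run the FSM over the token; accept only if it ends in an accepting state."""
    transitions, accepting = fsm
    state = 0
    for ch in token:
        next_state = transitions[state].get(ch)
        if next_state is None:  # no transition: the token cannot be a stop-word
            return False
        state = next_state
    return state in accepting


def remove_stop_words(text: str, fsm: Fsm) -> List[str]:
    """Split on whitespace and drop every token the FSM accepts."""
    return [token for token in text.split() if not is_stop_word(token, fsm)]


if __name__ == "__main__":
    fsm = build_fsm(SAMPLE_STOP_WORDS)
    print(remove_stop_words("الكتاب على الطاولة في البيت", fsm))
```

In this sketch a token is rejected at the first character that has no outgoing transition, so the cost of a lookup is bounded by the token length rather than by the number of stop-words. Real Arabic text would likely also need normalization (diacritics, letter variants) and handling of attached prefixes, which this sketch deliberately leaves out.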