Publication | Open Access
Foreign Words and the Automatic Processing of Arabic Social Media Text Written in Roman Script
Citations: 46
References: 21
Year: 2014
Unknown Venue
Arabic on social media has all the properties of any language on social media that make it tough for natural language processing, plus some specific problems. These include diglossia, the use of an alternative alphabet (Roman), and code switching with foreign languages. In this paper, we present a system which can process Arabic written in the Roman alphabet (“Arabizi”). It identifies whether each word is a foreign word or belongs to one of four other categories (Arabic, name, punctuation, sound), and transliterates Arabic words and names into the Arabic alphabet. We obtain an overall system performance of 83.8% on an unseen test set.