Publication | Open Access
A Hybrid Rule/Model-Based Finite-State Framework for Normalizing SMS Messages
Citations: 92
References: 19
Year: 2010
Unknown Venue
Normalizing SMS Messages, Engineering, Corpus Linguistics, Text Mining, Speech Recognition, Natural Language Processing, Spell Checking, Computational Linguistics, Systems Engineering, Language Studies, Machine Translation, SMS Messages, Data Normalization, Normalization Part, Computer Science, Finite-state System, Signal Processing, Neural Machine Translation, Text Normalization, Speech Translation, Speech Processing, Text Processing, Linguistics
In recent years, research in natural language processing has increasingly focused on normalizing SMS messages. Several well-defined approaches have been proposed, but the problem remains far from solved: the best systems achieve an 11% Word Error Rate. This paper presents a method that shares similarities with both spell checking and machine translation approaches. The normalization part of the system is entirely based on models trained from a corpus. Evaluated on French by 10-fold cross-validation, the system achieves a 9.3% Word Error Rate and a BLEU score of 0.83.
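For readers unfamiliar with the metric, Word Error Rate is conventionally the word-level edit distance (substitutions, insertions, deletions) between the system output and the reference, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the paper's evaluation code; the function name `word_error_rate`, the whitespace tokenization, and the example sentences are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: tokens are obtained by whitespace splitting; the paper's
    actual tokenization and scoring setup are not described in the abstract.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                       # deletion
                curr[j - 1] + 1,                   # insertion
                prev[j - 1] + substitution_cost,   # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one wrong word out of five reference words.
print(word_error_rate("see you tomorrow at eight", "see you tomorrow at 8"))
```

A corpus-level figure such as the 9.3% reported above is typically obtained by summing edit distances and reference lengths over all messages before dividing, rather than by averaging per-message rates; which aggregation the authors used is not stated here.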