Publication | Closed Access
Automatically Constructing a Normalisation Dictionary for Microblogs
Citations: 181
References: 34
Year: 2012
Unknown Venue
Microblog normalisation methods often utilise complex models and struggle to differentiate between correctly spelled unknown words and lexical variants of known words. In this paper, we propose a method for constructing a dictionary of lexical variants of known words that facilitates lexical normalisation via simple string substitution (e.g. tomorrow for tmrw). We use context information to generate possible variant and normalisation pairs and then rank these by string similarity. Highly ranked pairs are selected to populate the dictionary. We show that a dictionary-based approach achieves state-of-the-art performance
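
The rank-then-substitute idea in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration, not the paper's implementation: the candidate pairs are hard-coded (the context-based generation step is not reproduced), `difflib.SequenceMatcher` stands in for whatever string similarity the authors use, and the names `similarity`, `build_dictionary`, `normalise` and the 0.5 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in string similarity in [0, 1] (assumption: not the paper's measure)."""
    return SequenceMatcher(None, a, b).ratio()


def build_dictionary(candidate_pairs, threshold=0.5):
    """For each variant, keep the most similar candidate at or above the threshold."""
    best = {}
    for variant, word in candidate_pairs:
        score = similarity(variant, word)
        # Only replace the stored candidate if this one scores strictly higher.
        if score >= threshold and score > best.get(variant, ("", 0.0))[1]:
            best[variant] = (word, score)
    return {variant: word for variant, (word, _) in best.items()}


def normalise(text: str, dictionary: dict) -> str:
    """Normalise by plain token substitution; unknown tokens pass through unchanged."""
    return " ".join(dictionary.get(token, token) for token in text.split())


# Hypothetical candidate (variant, normalisation) pairs; in the paper these
# would come from context information rather than a hand-written list.
pairs = [("tmrw", "tomorrow"), ("tmrw", "today"), ("gr8", "great")]
lexicon = build_dictionary(pairs)
print(normalise("see you tmrw", lexicon))  # prints "see you tomorrow"
```

Because the dictionary is built once offline, normalisation at run time reduces to a lookup per token, which is the simplicity the abstract contrasts with more complex models.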