Publication | Closed Access
A survey of types of text noise and techniques to handle noisy text
Citations: 82
References: 32
Year: 2009
Venue: Unknown
Engineering, Communication, Corpus Linguistics, Text Mining, Noise Reduction, Speech Recognition, Natural Language Processing, Noisy Text, Information Retrieval, Data Science, Text Noise, Text Segmentation, Text Recognition, Computational Linguistics, Noise, Language Studies, Content Analysis, Machine Translation, Optical Character Recognition, Real World Noise, Noisy Data, Computer Science, Information Extraction, Text Normalization, Speech Processing, Text Processing, Document Processing
In the real world, noise is ubiquitous in text communications. Text produced by processing signals intended for human use is often noisy for automated computer processing. Automatic speech recognition, optical character recognition, and machine translation all introduce processing noise. Digital text produced in informal settings such as online chat, SMS, emails, message boards, newsgroups, blogs, wikis, and web pages also contains considerable noise. In this paper, we present a survey of the existing measures for noise in text. We also cover application areas that ingest this noisy text for various tasks such as Information Retrieval and Information Extraction.