Publication | Open Access
TnT - A Statistical Part-of-Speech Tagger
Year: 2000 | Citations: 325 | References: 7
Engineering, Speech Corpus, Part-of-speech Tagging, Corpus Linguistics, Text Mining, Speech Recognition, Natural Language Processing, Information Retrieval, Data Science, Computational Linguistics, Language Engineering, Language Studies, Machine Translation, NLP Task, Knowledge Discovery, Terminology Extraction, Tested Corpora, Maximum Entropy Framework, Statistical Part-of-speech Tagger, Unknown Words, Linguistics, POS Tagging
Trigrams'n'Tags (TnT) is an efficient statistical part-of-speech tagger built on a Markov model, with smoothing and dedicated handling of unknown words. The authors argue that such a tagger performs at least as well as other current approaches, including the Maximum Entropy framework, and cite a recent comparison in which TnT performed significantly better on the tested corpora. They describe the basic model and present evaluations on two corpora.
Trigrams'n'Tags (TnT) is an efficient statistical part-of-speech tagger. Contrary to claims found elsewhere in the literature, we argue that a tagger based on Markov models performs at least as well as other current approaches, including the Maximum Entropy framework. A recent comparison has even shown that TnT performs significantly better for the tested corpora. We describe the basic model of TnT and the techniques used for smoothing and for handling unknown words. Furthermore, we present evaluations on two corpora.