A practical part-of-speech tagger

TLDR

The paper presents a part-of-speech tagger based on a hidden Markov model that requires only a lexicon and unlabeled training text. It describes implementation optimizations for high-speed operation and three applications of tagging: phrase recognition, word sense disambiguation, and grammatical function assignment. Reported accuracy exceeds 96%.

Abstract

We present an implementation of a part-of-speech tagger based on a hidden Markov model. The methodology enables robust and accurate tagging with few resource requirements: only a lexicon and some unlabeled training text are required. Accuracy exceeds 96%. We describe implementation strategies and optimizations that result in high-speed operation. Three applications for tagging are described: phrase recognition, word sense disambiguation, and grammatical function assignment.
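
As an illustration of how a hidden Markov model assigns tags, the following minimal sketch runs standard Viterbi decoding over a toy model. The abstract does not spell out the decoding procedure, so the choice of Viterbi is an assumption here. The tag set, the probabilities, and the names `TAGS`, `START`, `TRANS`, `EMIT`, and `viterbi` are all hypothetical and hand-set for the example. In the setting the abstract describes, such parameters would be estimated from unlabeled text, with the lexicon restricting which tags each word may take.

```python
import math

# Toy model. Every number below is invented for illustration only;
# none of it comes from the paper.
TAGS = ["DET", "NOUN", "VERB"]

# P(first tag)
START = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}

# P(next tag | previous tag)
TRANS = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}

# Lexicon: the tags each word may take, with P(word | tag).
EMIT = {
    "the": {"DET": 0.7},
    "dog": {"NOUN": 0.4},
    "barks": {"NOUN": 0.05, "VERB": 0.3},
    "walks": {"NOUN": 0.1, "VERB": 0.3},
}


def emissions(word):
    # Assumption: a word missing from the lexicon may take any tag
    # with the same small probability.
    return EMIT.get(word, {tag: 1e-6 for tag in TAGS})


def viterbi(words):
    """Return the most probable tag sequence for words under the toy model."""
    if not words:
        return []

    # scores[i][tag] is the log probability of the best path ending in tag at i.
    scores = [{
        tag: math.log(START[tag]) + math.log(p)
        for tag, p in emissions(words[0]).items()
    }]
    # back[i][tag] is the previous tag on that best path.
    back = [{}]

    for word in words[1:]:
        prev = scores[-1]
        cur, ptr = {}, {}
        for tag, p in emissions(word).items():
            best_prev = max(prev, key=lambda q: prev[q] + math.log(TRANS[q][tag]))
            cur[tag] = prev[best_prev] + math.log(TRANS[best_prev][tag]) + math.log(p)
            ptr[tag] = best_prev
        scores.append(cur)
        back.append(ptr)

    # Follow the backpointers from the best final tag.
    tag = max(scores[-1], key=scores[-1].get)
    path = [tag]
    for ptr in reversed(back[1:]):
        tag = ptr[tag]
        path.append(tag)
    return path[::-1]


print(viterbi(["the", "dog", "barks"]))
```

In this toy lexicon "barks" may be a noun or a verb; the transition probabilities after a noun are what favor the verb reading, which is the kind of contextual disambiguation an HMM tagger performs.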
