Publication | Closed Access
Named entity recognition using an HMM-based chunk tagger
Citations: 720
References: 21
Year: 2001
Venue: Unknown
Engineering, Part-of-speech Tagging, HMM-based Chunk Tagger, Corpus Linguistics, Text Mining, Speech Recognition, Natural Language Processing, Information Retrieval, Data Science, Hidden Markov Model, Computational Linguistics, Entity Recognition, Language Studies, Named-entity Recognition, Machine Translation, Numerical Quantities, NLP Task, Knowledge Discovery, Information Extraction, Linguistics, Chunking, POS Tagging
The study introduces a Hidden Markov Model (HMM) and an HMM-based chunk tagger, on which a named entity recognition system is built to classify names, times, and numerical quantities. The HMM integrates four types of evidence: deterministic internal word features such as capitalization and digits, internal semantic triggers, internal gazetteer entries, and external macro-context features. On the MUC-6 and MUC-7 English NE tasks, the system achieved F-measures of 96.6% and 94.1% respectively, which the authors report as significantly better than any other machine-learning system and consistently better than systems based on handcrafted rules.
This paper proposes a Hidden Markov Model (HMM) and an HMM-based chunk tagger, from which a named entity (NE) recognition (NER) system is built to recognize and classify names, times and numerical quantities. Through the HMM, our system is able to apply and integrate four types of internal and external evidence: 1) simple deterministic internal features of the words, such as capitalization and digit patterns; 2) internal semantic features of important triggers; 3) internal gazetteer features; 4) external macro-context features. In this way, the NER problem can be resolved effectively. Evaluated on the MUC-6 and MUC-7 English NE tasks, our system achieves F-measures of 96.6% and 94.1% respectively. This performance is significantly better than that reported for any other machine-learning system. Moreover, it is consistently better than that of systems based on handcrafted rules.
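The four evidence types listed in the abstract can be illustrated with a minimal feature-extraction sketch. This is not the authors' implementation: the names `TRIGGERS`, `GAZETTEER`, `word_shape`, and `token_features` are hypothetical, the sample entries are invented for illustration, and the macro-context feature is assumed here to mean "this token was already recognized as part of an entity earlier in the document", which the abstract does not specify.

```python
from typing import Dict, List, Set

# Hypothetical resources: the abstract does not list the actual triggers
# or gazetteer entries, so these are illustrative placeholders only.
TRIGGERS: Dict[str, str] = {"Mr.": "PERSON_PREFIX", "Inc.": "ORG_SUFFIX"}
GAZETTEER: Dict[str, str] = {"Singapore": "LOCATION", "IBM": "ORGANIZATION"}


def word_shape(token: str) -> str:
    """Evidence type 1: deterministic internal feature (capitalization, digits)."""
    if token.isdigit():
        return "ALL_DIGITS"
    if any(c.isdigit() for c in token):
        return "CONTAINS_DIGIT"
    if token.isupper():
        return "ALL_CAPS"
    if token[:1].isupper():
        return "INIT_CAP"
    return "LOWER_OR_OTHER"


def token_features(tokens: List[str], i: int, seen_entities: Set[str]) -> Dict[str, str]:
    """Collect one value per evidence type for the token at position i."""
    tok = tokens[i]
    return {
        # 1) deterministic internal feature of the word
        "shape": word_shape(tok),
        # 2) internal semantic feature: is the word a known trigger?
        "trigger": TRIGGERS.get(tok, "NONE"),
        # 3) internal gazetteer feature: does the word appear in a name list?
        "gazetteer": GAZETTEER.get(tok, "NONE"),
        # 4) external macro-context feature (assumed interpretation, see lead-in)
        "macro_context": "SEEN_BEFORE" if tok in seen_entities else "NONE",
    }


tokens = ["Mr.", "Lee", "joined", "IBM", "in", "1990"]
features = token_features(tokens, 3, seen_entities={"IBM"})
```

In an HMM-based tagger, a feature bundle like this would typically form part of the observation for each token, with the hidden states encoding the entity class and chunk boundary; how the paper combines them is not described in the abstract.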