Publication | Open Access
Automatic acquisition of named entity tagged corpus from world wide web
Citations: 36
References: 5
Year: 2003
Unknown Venue
Engineering, Part-of-speech Tagging, Automatic Acquisition, NE List, Semantic Web, Semantics, Corpus Linguistics, Text Mining, Enough NE, Natural Language Processing, Information Retrieval, Data Science, Computational Linguistics, Language Studies, Named-entity Recognition, Entity Disambiguation, Knowledge Discovery, Terminology Extraction, NE Instances, Information Extraction, Semantic Tagging, Linguistics
In this paper, we present a method that automatically constructs a Named Entity (NE) tagged corpus from the web for training Named Entity Recognition systems. We use an NE list and a web search engine to collect web documents that contain the NE instances. The documents are refined through sentence separation and text refinement procedures, and the NE instances are finally tagged with the appropriate NE categories. Our experiments demonstrate that the suggested method can acquire, without any human intervention, an NE tagged corpus large enough to be as useful as a manually tagged one.
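The pipeline summarized in the abstract (collect documents, separate sentences, refine the text, tag NE instances) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the refinement rules or the tag format, so the length-based filter, the inline `<CATEGORY>` markup, and every name here (`NE_LIST`, `split_sentences`, `is_clean`, `tag_sentence`, `build_corpus`) are assumptions made for illustration. The web search step is replaced by an in-memory list of documents.

```python
import re

# Hypothetical NE list (surface form -> category); not taken from the paper.
NE_LIST = {
    "Marie Curie": "PERSON",
    "Seoul": "LOCATION",
}


def split_sentences(text):
    """Naive sentence separation: split after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def is_clean(sentence, min_tokens=3, max_tokens=50):
    """Assumed refinement rule: keep only sentences of plausible length."""
    return min_tokens <= len(sentence.split()) <= max_tokens


def tag_sentence(sentence, ne_list):
    """Wrap each NE instance in <CATEGORY>...</CATEGORY>, longest names first."""
    if not ne_list:
        return sentence
    names = sorted(ne_list, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")

    def wrap(match):
        name = match.group(1)
        category = ne_list[name]
        return f"<{category}>{name}</{category}>"

    return pattern.sub(wrap, sentence)


def build_corpus(documents, ne_list):
    """Return the tagged sentences that contain at least one NE instance."""
    corpus = []
    for doc in documents:
        for sentence in split_sentences(doc):
            if not is_clean(sentence):
                continue  # drop fragments such as navigation text
            tagged = tag_sentence(sentence, ne_list)
            if tagged != sentence:  # keep only sentences where a tag was added
                corpus.append(tagged)
    return corpus


# Stand-in for documents returned by a web search on the NE list entries.
documents = [
    "Marie Curie won two Nobel Prizes. Click here! "
    "The train to Seoul leaves at noon."
]
for line in build_corpus(documents, NE_LIST):
    print(line)
```

Matching longer names first keeps a multi-word entry from being split by a shorter entry it contains. Plain list lookup like this cannot resolve ambiguous names, so a real system would need a disambiguation step that this sketch omits.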