Publication | Open Access
Location normalization for information extraction
Citations: 93
References: 9
Year: 2002
Unknown Venue
Engineering, Geographic Information Retrieval, Semantics, Localization, Text Mining, Natural Language Processing, Information Retrieval, Data Science, Data Mining, Computational Linguistics, Language Studies, Linguistics, Knowledge Discovery, U.S. Country Names, Computer Science, Location Normalization, Location Names, Information Extraction, Distributional Semantics, Geospatial Semantics, Data Extraction, Location Information, Word-sense Disambiguation
Ambiguity is very high for location names. For example, there are 23 cities named 'Buffalo' in the U.S. Country names such as 'Canada', 'Brazil' and 'China' are also city names in the USA. Almost every city has a Main Street or Broadway. Such ambiguity needs to be handled before we can refer to location names for visualization of related extracted events. This paper presents a hybrid approach for location normalization which combines (i) a lexical grammar driven by local context constraints, (ii) graph search for a maximum spanning tree and (iii) integration of semi-automatically derived default senses. The focus is on resolving ambiguities for the following types of location names: island, town, city, province, and country. The results are promising, with 93.8% accuracy on our test collections.
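The abstract does not spell out how the maximum spanning tree in step (ii) is built or used, so the following is only a minimal sketch under stated assumptions, not the paper's implementation. It assumes each candidate sense is a node, senses of different names are linked by edges, and the edge weights (here 3 for senses in the same state, 1 otherwise) are hypothetical. Kruskal's algorithm, taking the heaviest edges first, yields a maximum spanning tree; the final step, where each name keeps the sense touched by its heaviest tree edge, is likewise an illustrative assumption.

```python
def maximum_spanning_tree(nodes, edges):
    """Kruskal's algorithm with edges visited from heaviest to lightest.

    edges is a list of (weight, u, v) tuples; returns the edges kept in the tree.
    """
    parent = {n: n for n in nodes}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges, key=lambda e: e[0], reverse=True):
        ru, rv = find(u), find(v)
        if ru != rv:  # keep the edge only if it joins two separate components
            parent[ru] = rv
            tree.append((w, u, v))
    return tree


def pick_senses(tree):
    """Hypothetical selection rule: per name, keep the sense on the heaviest tree edge."""
    best = {}
    for w, u, v in tree:
        for sense in (u, v):
            name = sense[0]
            if name not in best or w > best[name][0]:
                best[name] = (w, sense)
    return {name: sense for name, (w, sense) in best.items()}


# Toy candidate senses as (name, state) pairs; the weights are made up for illustration.
senses = [("Buffalo", "NY"), ("Buffalo", "TX"), ("Rochester", "NY"), ("Rochester", "MN")]
edges = [
    (3, ("Buffalo", "NY"), ("Rochester", "NY")),  # same state: strong link
    (1, ("Buffalo", "NY"), ("Rochester", "MN")),
    (1, ("Buffalo", "TX"), ("Rochester", "NY")),
    (1, ("Buffalo", "TX"), ("Rochester", "MN")),
]

tree = maximum_spanning_tree(senses, edges)
chosen = pick_senses(tree)
print(chosen)
```

In this toy input, the co-occurring names reinforce each other through the heavy same-state edge. A name with no informative edges would need another source of evidence, which is where a default sense such as the one in step (iii) could serve as a fallback.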