Publication | Closed Access
GeoTxt: A scalable geoparsing system for unstructured text geolocation
Citations: 99
References: 51
Year: 2019
Engineering, Geographic Information Retrieval, Present Geotxt, Semantic Web, Localization, Text Mining, Natural Language Processing, Geographic Information Systems, Information Retrieval, Data Science, Entity Recognition, Public Health, Named-entity Recognition, Entity Disambiguation, Computer Science, Geographical Text Analysis, Scalable Geoparsing, Geospatial Semantics, Scalable Geoparsing System, Location Information
GeoTxt is a scalable geoparsing system that recognizes and geolocates place names in unstructured text. It provides access to six NER algorithms, uses an enterprise search engine to index, rank, and retrieve toponyms, and exposes a flexible API. The system is evaluated on a corpus of manually geo-annotated tweets. Its toponym resolution is 20% more accurate than using the GeoNames web service, and the results show that places mentioned in the same tweet tend not to be geographically close.
Abstract
In this article we present GeoTxt, a scalable geoparsing system for the recognition and geolocation of place names in unstructured text. GeoTxt offers six named entity recognition (NER) algorithms for place name recognition, and utilizes an enterprise search engine for the indexing, ranking, and retrieval of toponyms, enabling scalable geoparsing for streaming text. GeoTxt offers a flexible application programming interface (API), allowing for customized attribute and/or spatial ranking of retrieved toponyms. We evaluate the system on a corpus of manually geo-annotated tweets. First, we benchmark the performance of the six NERs that GeoTxt provides access to. Second, we assess GeoTxt toponym resolution accuracy incrementally, demonstrating improvements in toponym resolution achieved (or not achieved) by adding specific heuristics and disambiguation methods. Compared to using the GeoNames web service, GeoTxt's toponym resolution demonstrates a 20% accuracy gain. Our results show that places mentioned in the same tweet do not tend to be geographically proximate.