Publication | Closed Access
A hybrid method for Chinese address segmentation
Citations: 26
References: 17
Year: 2017
Statistical Methods, Optical Character Recognition, Geographic Information Retrieval, Chinese Address Segmentation, Text Segmentation, Geography, Typical Statistical Methods, East Asian Languages, Character Recognition, Location Information, Machine Translation, Speech Recognition
Chinese address segmentation is a major challenge in geocoding for geographic information systems. Most previous studies have relied on predefined gazetteers without considering the information contained in a raw address corpus. This paper proposes a hybrid method that combines rule-based and statistical methods to segment Chinese addresses without a predefined gazetteer. The approach uses statistical methods to extract address information from a raw address corpus and a rule-based method to segment the addresses. In an experiment involving approximately 460,000 address items from Shenzhen City, China, the hybrid method is compared with two typical statistical methods and with their combinations with rule-based methods. The results indicate that the proposed method achieves an F-score of over 0.8, higher than those of the existing methods, which supports the validity of the proposed method.
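For readers unfamiliar with the metric, the following is a minimal sketch of how an F-score for a segmentation result is commonly computed, by comparing predicted segments with reference segments as character spans. It is an illustration under stated assumptions, not the evaluation procedure of the paper: the function names `spans` and `segmentation_f_score` and the example address are hypothetical, and the abstract does not say how matches were counted.

```python
def spans(tokens):
    """Convert a list of segments into a set of (start, end) character spans."""
    out, pos = set(), 0
    for tok in tokens:
        out.add((pos, pos + len(tok)))
        pos += len(tok)
    return out


def segmentation_f_score(predicted, gold):
    """F-score (harmonic mean of precision and recall) over exactly matching spans.

    Assumption: a predicted segment counts as correct only if both of its
    boundaries match a reference segment. The paper may count differently.
    """
    pred_spans, gold_spans = spans(predicted), spans(gold)
    correct = len(pred_spans & gold_spans)
    if correct == 0:
        return 0.0
    precision = correct / len(pred_spans)
    recall = correct / len(gold_spans)
    return 2 * precision * recall / (precision + recall)


# Hypothetical address, made up for illustration (not taken from the paper's data).
gold = ["深圳市", "南山区", "科技路", "1号"]
predicted = ["深圳市", "南山区", "科技", "路1号"]

score = segmentation_f_score(predicted, gold)
print(score)
```

In this example two of the four predicted segments match the reference exactly, so precision and recall are both 0.5 and the F-score is 0.5.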