Publication | Open Access
Mining key phrase translations from web corpora
Citations: 72 | References: 9 | Year: 2005
Unknown Venue
Engineering, Semantic Web, Corpus Linguistics, Text Mining, Natural Language Processing, Information Retrieval, Computational Linguistics, Key Phrase Translations, Query Expansion, Language Studies, Machine Translation, NLP Task, Terminology Extraction, Keyword Search, Key Phrases, Keyword Extraction, Language Corpus, Key Phrase Translation, Linguistics
Key phrases are usually among the most information-bearing linguistic structures. Translating them correctly will improve many natural language processing applications. We propose a new framework to mine key phrase translations from web corpora. We submit a source phrase to a search engine as a query, then expand the query by adding the translations of topic-relevant hint words found in the returned snippets. We retrieve mixed-language web pages with the expanded queries. Finally, we extract the key phrase translation from the snippets returned in this second round, using phonetic, semantic and frequency-distance features. We achieve 46% phrase translation accuracy when using the top 10 returned snippets, and 80% accuracy with 165 snippets. Both results are significantly better than those of several existing methods.
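The final extraction step can be illustrated with a minimal sketch of a frequency-distance style score. This is not the paper's feature definition, which the abstract does not give. It assumes whitespace-tokenized snippets, a hypothetical `is_target_token` test that treats any token containing a non-ASCII character as target-language, and a score in which each candidate occurrence contributes the inverse of its token distance to the nearest occurrence of the source phrase. The phonetic and semantic features are omitted, and all function names are illustrative.

```python
from collections import defaultdict


def is_target_token(token: str) -> bool:
    """Hypothetical language test: a token with any non-ASCII character
    is treated as target-language text (an assumption for this sketch)."""
    return any(ord(ch) > 127 for ch in token)


def find_phrase_positions(tokens, phrase_tokens):
    """Return the start indices at which the source phrase occurs."""
    n = len(phrase_tokens)
    return [i for i in range(len(tokens) - n + 1)
            if tokens[i:i + n] == phrase_tokens]


def frequency_distance_scores(source_phrase, snippets):
    """Score candidate tokens across snippets.

    Each candidate occurrence adds 1 / d to the candidate's score, where d
    is the token distance to the nearest source phrase occurrence in the
    same snippet. Frequent, nearby candidates therefore score highest.
    """
    phrase_tokens = source_phrase.lower().split()
    if not phrase_tokens:
        return {}
    span = len(phrase_tokens)
    scores = defaultdict(float)
    for snippet in snippets:
        tokens = snippet.lower().split()
        starts = find_phrase_positions(tokens, phrase_tokens)
        if not starts:
            continue  # snippet does not contain the source phrase
        for idx, token in enumerate(tokens):
            if not is_target_token(token):
                continue
            distances = []
            for start in starts:
                end = start + span - 1
                if idx < start:
                    distances.append(start - idx)
                elif idx > end:
                    distances.append(idx - end)
                else:
                    distances.append(0)  # token lies inside the phrase
            nearest = min(distances)
            if nearest == 0:
                continue  # skip tokens that are part of the source phrase
            scores[token] += 1.0 / nearest
    return dict(scores)


def best_candidate(source_phrase, snippets):
    """Return the highest-scoring candidate, or None if there is none."""
    scores = frequency_distance_scores(source_phrase, snippets)
    return max(scores, key=scores.get) if scores else None
```

With the made-up snippets `["faust 浮士德 is a play", "the play faust ( 浮士德 ) by goethe", "歌德 wrote faust"]` and the source phrase `"Faust"`, `best_candidate` returns `浮士德`: it appears twice and close to the source phrase, while `歌德` appears once and farther away.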