Publication | Closed Access
Better Word Representations with Recursive Neural Networks for Morphology
Citations: 810
References: 35
Year: 2013
Engineering, Cross-lingual Representation, Neurolinguistics, Rare Words, Morphology (Linguistics), Vector-space Word Representations, Corpus Linguistics, Text Mining, Word Embeddings, Natural Language Processing, Data Science, Computational Linguistics, Language Studies, Better Word Representations, Word Representations, Language Models, Machine Translation, NLP Task, Morphology, Morphological Analysis, Deep Learning, Linguistics, Po Tagging
Vector-space word representations have improved many NLP tasks, but they treat words as independent, ignoring morphological relations, which leads to poor estimates for rare and complex words and crude representations for unknown words. The paper proposes a model that constructs representations for morphologically complex words from their morphemes. The model uses recursive neural networks over morphemes combined with neural language models to incorporate contextual information. The learned models outperform existing word representations on word similarity tasks across multiple datasets, including a newly introduced rare‑word dataset.
Vector-space word representations have been very successful in recent years at improving performance across a variety of NLP tasks. However, common to most existing work, words are regarded as independent entities without any explicit relationship among morphologically related words being modeled. As a result, rare and complex words are often poorly estimated, and all unknown words are represented in a rather crude way using only one or a few vectors. This paper addresses this shortcoming by proposing a novel model that is capable of building representations for morphologically complex words from their morphemes. We combine recursive neural networks (RNNs), where each morpheme is a basic unit, with neural language models (NLMs) to consider contextual information in learning morphologically-aware word representations. Our learned models outperform existing word representations by a good margin on word similarity tasks across many datasets, including a new dataset we introduce focused on rare words to complement existing ones in an interesting way.
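To make the idea of building a word vector from its morphemes concrete, the following is a minimal sketch of recursive composition, not the authors' implementation. It assumes a toy embedding size (`DIM`), randomly initialized parameters, a single shared composition matrix `W` with bias `b`, a tanh nonlinearity, a left-to-right composition order, and a hypothetical segmentation of "unfortunately" into `un`, `fortunate`, `ly`. Training with a neural language model objective is omitted entirely.

```python
import math
import random

DIM = 4  # toy embedding size (assumption, chosen only for illustration)


def init_vector(rng, dim=DIM):
    """Small random vector standing in for a learned morpheme embedding."""
    return [rng.uniform(-0.1, 0.1) for _ in range(dim)]


def init_matrix(rng, rows, cols):
    """Small random matrix standing in for learned composition weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def compose(left, right, W, b):
    """Combine two child vectors into a parent: tanh(W [left; right] + b)."""
    children = left + right  # list concatenation, length 2 * DIM
    return [
        math.tanh(sum(w * x for w, x in zip(row, children)) + bias)
        for row, bias in zip(W, b)
    ]


def word_vector(morphemes, embeddings, W, b):
    """Fold the morpheme sequence left to right into one word vector.

    The composition order is an assumption of this sketch.
    """
    vec = embeddings[morphemes[0]]
    for morpheme in morphemes[1:]:
        vec = compose(vec, embeddings[morpheme], W, b)
    return vec


rng = random.Random(0)
# Hypothetical segmentation; a real system would use a morphological segmenter.
segmentation = ["un", "fortunate", "ly"]
embeddings = {m: init_vector(rng) for m in segmentation}
W = init_matrix(rng, DIM, 2 * DIM)
b = [0.0] * DIM

vec = word_vector(segmentation, embeddings, W, b)
print(vec)
```

Because the parent vector has the same dimensionality as each child, the same `compose` step can be reused at every level, and an unseen word receives a vector as long as its morphemes are known.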