Publication | Open Access
Representation Degeneration Problem in Training Natural Language Generation Models
Citations: 105
References: 19
Year: 2019
We study an interesting problem in training neural network-based models for natural language generation tasks, which we call the representation degeneration problem. We observe that when training a model for natural language generation tasks through likelihood maximization with the weight tying trick, especially with big training datasets, most of the learnt word embeddings tend to degenerate and be distributed into a narrow cone, which largely limits the representation power of word embeddings. We analyze the conditions and causes of this problem and propose a novel regularization method to address it. Experiments on language modeling and machine translation show that our method can largely mitigate the representation degeneration problem and achieve better performance than baseline algorithms.
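One way to picture the "narrow cone" described above is to measure how similar embedding directions are to each other. The sketch below is an illustrative diagnostic only, not the regularization method proposed in the paper: it computes the mean pairwise cosine similarity of a set of vectors. The function name `mean_pairwise_cosine` and both synthetic embedding sets are assumptions made for this example; a value near 0 suggests well-spread directions, while a value near 1 suggests vectors crowded into a narrow cone.

```python
import math
import random


def mean_pairwise_cosine(embeddings):
    """Average cosine similarity over all distinct pairs of row vectors."""
    # Normalize every vector to unit length so dot products are cosines.
    unit = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec))
        unit.append([x / norm for x in vec])

    total, pairs = 0.0, 0
    for i in range(len(unit)):
        for j in range(i + 1, len(unit)):
            total += sum(a * b for a, b in zip(unit[i], unit[j]))
            pairs += 1
    return total / pairs


rng = random.Random(0)
dim, vocab = 16, 200  # hypothetical sizes, chosen only to keep the demo fast

# Hypothetical well-spread embeddings: isotropic Gaussian vectors.
spread = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(vocab)]

# Hypothetical degenerate embeddings: the same noise plus a large shared
# offset, so every vector points in roughly the same direction.
offset = [5.0] * dim
cone = [[o + x for o, x in zip(offset, vec)] for vec in spread]

spread_score = mean_pairwise_cosine(spread)
cone_score = mean_pairwise_cosine(cone)
print(f"spread: {spread_score:.3f}  cone: {cone_score:.3f}")
```

Applied to the tied embedding matrix of a trained model, a score like this would give a rough indication of whether the learnt embeddings occupy a narrow cone; the quadratic loop is only practical for small vocabularies or a random sample of rows.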