Publication | Open Access
THUMT: An Open Source Toolkit for Neural Machine Translation
Citations: 88
References: 12
Year: 2017
Engineering, Machine Learning, Multilingual Pretraining, Large Language Model, Corpus Linguistics, Language Processing, Speech Recognition, Natural Language Processing, Open Source Toolkit, Computational Linguistics, Language Studies, Machine Translation, Computer-assisted Translation, Minimum Risk Training, Multimodal Translation, Deep Learning, Neural Machine Translation, Speech Translation, Minimum Risk, Linguistics
This paper introduces THUMT, an open-source toolkit for neural machine translation (NMT) developed by the Natural Language Processing Group at Tsinghua University. THUMT implements the standard attention-based encoder-decoder framework on top of Theano and supports three training criteria: maximum likelihood estimation, minimum risk training, and semi-supervised training. It features a visualization tool for displaying the relevance between hidden states in neural networks and contextual words, which helps to analyze the internal workings of NMT. Experiments on Chinese-English datasets show that THUMT using minimum risk training significantly outperforms GroundHog, a state-of-the-art toolkit for NMT.
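To make the minimum risk training criterion mentioned above more concrete, the following is a minimal sketch of the expected-risk objective as it is commonly formulated for NMT: candidate translations are reweighted by a sharpened, renormalized model distribution, and the loss (for example, 1 minus sentence-level BLEU) is averaged under that distribution. The function name `expected_risk`, the parameter `alpha`, and the toy numbers are illustrative assumptions and are not taken from THUMT's code.

```python
import math


def expected_risk(log_probs, losses, alpha=0.005):
    """Expected loss over a sampled candidate set (hypothetical helper).

    log_probs: model log-probabilities log P(y|x) for each candidate.
    losses:    loss of each candidate against the reference (e.g. 1 - BLEU).
    alpha:     sharpness of the renormalized distribution (assumed default).
    """
    # Q(y|x) is proportional to P(y|x)^alpha, i.e. exp(alpha * log P(y|x)).
    scaled = [alpha * lp for lp in log_probs]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    z = sum(weights)
    q = [w / z for w in weights]
    # The risk is the loss averaged under Q.
    return sum(qi * li for qi, li in zip(q, losses))


# Toy example: two candidates with probabilities 0.75 and 0.25.
risk = expected_risk([math.log(0.75), math.log(0.25)], [0.0, 1.0], alpha=1.0)
```

In an actual training loop this quantity would be minimized with respect to the model parameters, so that probability mass shifts toward candidates with lower loss; the sketch only evaluates the objective for fixed scores.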