Sequence-to-Sequence Learning as Beam-Search Optimization

Abstract

Sequence-to-Sequence (seq2seq) modeling has rapidly become an important general-purpose NLP tool that has proven effective for many text-generation and sequence-labeling tasks. Seq2seq builds on deep neural language modeling and inherits its remarkable accuracy in estimating local, next-word distributions. In this work, we introduce a model and beam-search training scheme, based on the work of [...]. This structured approach avoids classical biases associated with local training and unifies the training loss with the test-time usage, while preserving the proven model architecture of seq2seq and its efficient training approach. We show that our system outperforms a highly optimized attention-based seq2seq system and other baselines on three different sequence-to-sequence tasks: word ordering, parsing, and machine translation.
