Generative and Discriminative Text Classification with Recurrent Neural Networks

Abstract

We empirically characterize the performance of discriminative and generative LSTM models for text classification. We find that although RNN-based generative models are more powerful than their bag-of-words ancestors (e.g., they account for conditional dependencies across words in a document), they have higher asymptotic error rates than discriminatively trained RNN models. However, we also find that generative models approach their asymptotic error rate more rapidly than their discriminative counterparts, the same pattern that Ng & Jordan (2001) proved holds for linear classification models that make more naive conditional independence assumptions. Building on this finding, we hypothesize that RNN-based generative classification models will be more robust to shifts in the data distribution. This hypothesis is confirmed in a series of experiments in zero-shot and continual learning settings that show that generative models substantially outperform discriminative models.