Concepedia

TLDR

Discriminative learning methods are widely used in NLP but perform best when training and test data come from the same distribution, a challenge when labeled data is scarce in new domains. The goal is to adapt models from a resource-rich source domain to a resource-poor target domain. Structural correspondence learning automatically induces correspondences among features from different domains. Applying this technique to part-of-speech tagging yields performance gains across varying amounts of source and target data and improves target-domain parsing accuracy with the enhanced tagger.

Abstract

Discriminative learning methods are widely used in natural language processing. These methods work best when their training and test data are drawn from the same distribution. For many NLP tasks, however, we are confronted with new domains in which labeled data is scarce or non-existent. In such cases, we seek to adapt existing models from a resource-rich source domain to a resource-poor target domain. We introduce structural correspondence learning to automatically induce correspondences among features from different domains. We test our technique on part of speech tagging and show performance gains for varying amounts of source and target training data, as well as improvements in target domain parsing accuracy using our improved tagger.
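The abstract names structural correspondence learning without spelling out the mechanism, so a minimal sketch may help: pivot features that are frequent in both domains are predicted from the remaining features on unlabeled data, the stacked predictor weights are factored with an SVD, and the resulting projection is appended to the original feature vectors. This is a simplified sketch, not the authors' code: the least-squares pivot predictors stand in for the modified Huber loss the paper uses (following Ando and Zhang's alternating structural optimization), and the function names and the k=25 default are illustrative assumptions.

    import numpy as np

    def scl_projection(X, pivot_idx, k=25):
        # X: (n_examples, n_features) binary feature matrix over the
        # union of unlabeled source and target data.
        # pivot_idx: indices of pivot features, chosen to be frequent
        # and behave similarly in both domains (e.g., frequent words
        # for POS tagging).
        # k: dimensionality of the shared representation (illustrative).
        n, d = X.shape
        non_pivot = np.setdiff1d(np.arange(d), np.asarray(pivot_idx))
        W = np.zeros((len(non_pivot), len(pivot_idx)))
        for j, p in enumerate(pivot_idx):
            # One linear predictor per pivot: predict the pivot's value
            # from the non-pivot features. Plain least squares stands in
            # for the paper's modified-Huber SGD.
            w, *_ = np.linalg.lstsq(X[:, non_pivot], X[:, p], rcond=None)
            W[:, j] = w
        # SVD of the stacked predictor weights; the top-k left singular
        # vectors project features onto the induced shared subspace.
        U, _, _ = np.linalg.svd(W, full_matrices=False)
        return U[:, :k], non_pivot

    def augment(X, U_k, non_pivot):
        # Append the induced correspondence features to the originals;
        # a standard supervised tagger is then trained on the augmented
        # vectors using labeled source-domain data.
        return np.hstack([X, X[:, non_pivot] @ U_k])

One design point worth noting: the projected features augment rather than replace the original representation, so the tagger can still rely on ordinary in-domain features wherever they carry signal and fall back on the shared subspace for target-domain features never seen in source training data.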
