Sentence Simplification for Semantic Role Labeling

TLDR

Parse-tree paths are widely used to bring syntactic information into NLP systems, but systems that treat these paths as atomic features suffer from sparsity because of the vast variety of syntactic expression. The authors propose an iterative sentence-simplification method that decomposes complex syntax into simpler fragments for improved semantic role labeling (SRL). The method applies hand-written syntactic transformation rules, with learned weights expressing preferences among rules, to generate a set of candidate simplifications. Because no labeled simplifications are available, the simplification and SRL models are trained jointly on SRL data, treating the simplification as a hidden variable. The combined simplification/SRL system generalizes better across syntactic variation and achieves a statistically significant 1.2% F1 gain over a strong baseline on the CoNLL-2005 SRL task, reaching near-state-of-the-art performance.

Abstract

Parse-tree paths are commonly used to incorporate information from syntactic parses into NLP systems. These systems typically treat the paths as atomic (or nearly atomic) features; these features are quite sparse due to the immense variety of syntactic expression. In this paper, we propose a general method for learning how to iteratively simplify a sentence, thus decomposing complicated syntax into small, easy-to-process pieces. Our method applies a series of hand-written transformation rules corresponding to basic syntactic patterns; for example, one rule “depassivizes” a sentence. The model is parameterized by learned weights specifying preferences for some rules over others. After applying all possible transformations to a sentence, we are left with a set of candidate simplified sentences. We apply our simplification system to semantic role labeling (SRL). As we do not have labeled examples of correct simplifications, we use labeled training data for the SRL task to jointly learn both the weights of the simplification model and of an SRL model, treating the simplification as a hidden variable. By extracting and labeling simplified sentences, this combined simplification/SRL system better generalizes across syntactic variation. It achieves a statistically significant 1.2% F1 measure increase over a strong baseline on the CoNLL-2005 SRL task, attaining near-state-of-the-art performance.
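
The idea of applying weighted rewrite rules until no rule fires, then scoring the resulting candidates, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the two rules work on plain strings with regular expressions rather than on parse trees, the rule names and weights are invented, intermediate forms are kept as candidates, and the softmax over summed rule weights is only one plausible way to turn rule preferences into a distribution. In joint training, an SRL model's predictions would be combined across these candidates, with the choice of candidate left unobserved; that step is not shown.

```python
import math
import re
from collections import deque

# Hypothetical string-level rules (the paper's rules operate on parse trees).
# Each rule returns a rewritten sentence, or None when it does not apply.

def depassivize(sentence):
    """Toy rule: 'X was VERBed by Y' -> 'Y VERBed X'."""
    m = re.fullmatch(r"(?P<obj>[^,]+?) was (?P<verb>\w+ed) by (?P<subj>.+)", sentence)
    if m is None:
        return None
    return f"{m['subj']} {m['verb']} {m['obj']}"

def drop_leading_modifier(sentence):
    """Toy rule: remove a sentence-initial modifier that ends at the first comma."""
    m = re.fullmatch(r"[^,]+, (?P<rest>.+)", sentence)
    return m["rest"] if m else None

RULES = {
    "depassivize": depassivize,
    "drop_leading_modifier": drop_leading_modifier,
}

def simplifications(sentence, rules=RULES):
    """Apply rules breadth-first until nothing new is produced.

    Returns a dict mapping each reachable sentence (including the original)
    to the tuple of rule names that first produced it.
    """
    seen = {sentence: ()}
    queue = deque([sentence])
    while queue:
        current = queue.popleft()
        for name, rule in rules.items():
            rewritten = rule(current)
            if rewritten is not None and rewritten not in seen:
                seen[rewritten] = seen[current] + (name,)
                queue.append(rewritten)
    return seen

def candidate_distribution(candidates, weights):
    """Softmax over the summed weights of the rules used for each candidate."""
    scores = {
        s: sum(weights.get(rule, 0.0) for rule in path)
        for s, path in candidates.items()
    }
    z = sum(math.exp(v) for v in scores.values())
    return {s: math.exp(v) / z for s, v in scores.items()}
```

Calling `simplifications("Yesterday, the ball was kicked by the boy")` and passing the result to `candidate_distribution` with illustrative weights such as `{"depassivize": 1.0, "drop_leading_modifier": 0.5}` gives a probability for each candidate; with learned weights, candidates reached through preferred rules would receive more mass.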
