Publication | Closed Access
Towards the Annotation of Penn TreeBank with Information Structure
Citations: 12
References: 16
Year: 2013
Syntactic Parsing, Engineering, Tagging, Corpus Linguistics, Language Processing, Natural Language Processing, Applied Linguistics, Syntax, Data-driven NLP, Computational Linguistics, Penn Treebank, Grammar, Corpus Analysis, Language Studies, Machine Translation, Meaning-Text Theory, Semantic Parsing, Shallow Parsing, Parsing, Treebanks, Linguistics
Information Structure (IS) determines the “communicative” segmentation of the meaning of an utterance, which makes it central to the semantics-syntax-intonation interface and therefore also to NLP. Despite this relevance, IS has received little attention in most of the reference treebanks for data-driven NLP, even though these already contain semantic and syntactic layers of annotation. We present our work in progress on the annotation of the Penn TreeBank with the thematicity dimension of IS as defined in the Meaning-Text Theory. We experiment with tagging and transition-based parsing techniques. The latter in particular achieve acceptable accuracy even with very small training samples, which is promising for languages with scarce resources.