Publication | Closed Access
Prague Dependency Treebank 2.0 (PDT 2.0)
Year: 2006 | Citations: 23 | References: 0
Syntactic Parsing, Engineering, Complex Semantic Annotation, Latest Annotation Technology, Part-of-speech Tagging, Dependency Linguistics, Corpus Linguistics, Applied Linguistics, Natural Language Processing, Syntax, Language Documentation, Computational Linguistics, Grammar, Corpus Analysis, Language Studies, Machine Translation, Treebanks, Language Corpus, PDT 2.0, Linguistics
The Prague Dependency Treebank 2.0 (PDT 2.0) contains a large collection of Czech texts with complex, interlinked morphological (2 million words), syntactic (1.5 million words) and complex semantic (0.8 million words) annotation; in addition, certain properties of sentence information structure and coreference relations are annotated at the semantic level. PDT 2.0 is based on the long-standing Praguian linguistic tradition, adapted to the needs of current computational linguistics research. The corpus itself uses the latest annotation technology. Software tools for corpus search, annotation and language analysis are included, and extensive documentation (in English) is provided as well.