Publication | Closed Access
An efficient memory-based morphosyntactic tagger and parser for Dutch
Citations: 158
References: 9
Year: 2007
Syntactic Parsing, K-nearest Neighbor Classification, Tagging, Engineering, Part-of-speech Tagging, Corpus Linguistics, Language Processing, Text Mining, Applied Linguistics, Natural Language Processing, Syntax, Computational Linguistics, Grammar, Corpus Analysis, Language Studies, Machine Translation, Memory Usage, Parsing, Treebanks, Dependency Parser, Linguistics, POS Tagging
We describe TADPOLE, a modular memory-based morphosyntactic tagger and dependency parser for Dutch. Although accuracy is the primary aim, the design of the system is also driven by optimizing speed and memory usage; each module is built on a trie-based approximation of k-nearest neighbor classification. We evaluate its three main modules: a part-of-speech tagger, a morphological analyzer, and a dependency parser. All three are trained on manually annotated material available for Dutch, and the parser is additionally trained on automatically parsed data. A global analysis of the system shows that it processes text in linear time, at an estimated rate close to 2,500 words per second, while maintaining sufficient accuracy.
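To illustrate what a trie-based approximation of k-nearest neighbor classification can look like, here is a minimal sketch in Python. It is not TADPOLE's implementation: the abstract does not give the details, so the class names (`TrieNode`, `TrieClassifier`), the toy features (focus word, previous tag), the labels, and the assumption that features are already ordered from most to least informative are all hypothetical. The idea shown is that training instances are compressed into a trie keyed on feature values, and classification follows matching values until the first mismatch, then returns the majority class stored at the last matching node.

```python
from collections import Counter


class TrieNode:
    """One trie node; counts the classes of all training examples passing through it."""

    def __init__(self):
        self.children = {}             # feature value -> child TrieNode
        self.class_counts = Counter()  # class label -> number of examples seen here

    def default_class(self):
        # Majority class among the training examples that reached this node.
        return self.class_counts.most_common(1)[0][0]


class TrieClassifier:
    """Hypothetical sketch: features are assumed pre-sorted by informativeness."""

    def __init__(self):
        self.root = TrieNode()

    def train(self, examples):
        # Each example is (tuple_of_feature_values, label).
        for features, label in examples:
            node = self.root
            node.class_counts[label] += 1
            for value in features:
                node = node.children.setdefault(value, TrieNode())
                node.class_counts[label] += 1

    def classify(self, features):
        # Follow matching feature values; stop at the first mismatch and
        # back off to the majority class of the deepest node reached.
        node = self.root
        for value in features:
            child = node.children.get(value)
            if child is None:
                break
            node = child
        return node.default_class()


# Toy data (invented for illustration): (focus word, previous tag) -> tag.
clf = TrieClassifier()
clf.train([
    (("de", "VERB"), "DET"),
    (("de", "PREP"), "DET"),
    (("fiets", "DET"), "NOUN"),
    (("fietst", "PRON"), "VERB"),
    (("loopt", "PRON"), "VERB"),
])

print(clf.classify(("de", "NOUN")))     # unseen context, backs off to the "de" node
print(clf.classify(("fiets", "DET")))   # exact match with a training instance
```

In this sketch, a lookup visits at most one node per feature, so the cost of classifying a token does not grow with the number of stored training instances. That property is one plausible reading of how a trie-based approximation supports linear-time processing of running text.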