Publication | Open Access
GutenTag: an NLP-driven Tool for Digital Humanities Research in the Project Gutenberg Corpus
Citations: 49
References: 13
Year: 2015
Unknown Venue
This paper introduces GutenTag, a software tool that aims to give literary researchers direct access to NLP techniques for analyzing texts in the Project Gutenberg corpus. We discuss several facets of the tool, including the handling of formatting and structure, the use and expansion of metadata to identify relevant subcorpora of interest, and a general tagging framework intended to cover a wide variety of future NLP modules. Our hope is that the shared ground created by this tool will help create new kinds of interaction between the computational linguistics and digital humanities communities, to the benefit of both.