Publication | Closed Access
N-GRAM-BASED AUTHOR PROFILES FOR AUTHORSHIP ATTRIBUTION
Citations: 382
References: 11
Year: 2003
Venue: Unknown
Engineering, Generated Author Profiles, Computer-assisted Authorship Attribution, Writer Identification, Corpus Linguistics, Text Mining, Natural Language Processing, Information Retrieval, Data Science, Computational Linguistics, Language Engineering, Language Studies, Machine Translation, Knowledge Discovery, Author Profiling, Authorship Attribution, Content Similarity Detection, Text Processing, Linguistics
We present a novel method for computer-assisted authorship attribution based on character-level n-gram author profiles, which is motivated by an almost-forgotten, pioneering method from 1976. The existing approaches to automated authorship attribution implicitly build author profiles as vectors of feature weights, as language models, or similar. Our approach is based on byte-level n-grams, is language independent, and generates author profiles that are limited in size. The effectiveness of the approach and its language independence are demonstrated in experiments performed on English, Greek, and Chinese data. The accuracy of the results is at the level of current state-of-the-art approaches, or higher in some cases.
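The abstract describes the idea only at a high level, so the following is a minimal Python sketch of what a byte-level n-gram author profile and a profile comparison could look like. It is an illustration under stated assumptions, not the paper's implementation: the function names (`build_profile`, `dissimilarity`, `attribute`), the default values of `n` and `size`, and the normalized squared-difference dissimilarity are all assumptions, since none of them is specified in the abstract.

```python
from collections import Counter
from typing import Dict


def build_profile(data: bytes, n: int = 3, size: int = 500) -> Dict[bytes, float]:
    """Keep the `size` most frequent byte n-grams, with relative frequencies.

    `n` and `size` are illustrative defaults, not values from the abstract.
    """
    grams = Counter(data[i:i + n] for i in range(len(data) - n + 1))
    total = sum(grams.values())
    if total == 0:
        return {}
    return {gram: count / total for gram, count in grams.most_common(size)}


def dissimilarity(p1: Dict[bytes, float], p2: Dict[bytes, float]) -> float:
    """Assumed measure: squared frequency difference, scaled by the mean frequency.

    An n-gram missing from a profile is treated as having frequency 0.
    """
    score = 0.0
    for gram in p1.keys() | p2.keys():
        f1 = p1.get(gram, 0.0)
        f2 = p2.get(gram, 0.0)
        # f1 + f2 > 0 because the n-gram occurs in at least one profile.
        score += ((f1 - f2) / ((f1 + f2) / 2)) ** 2
    return score


def attribute(text: bytes, author_profiles: Dict[str, Dict[bytes, float]],
              n: int = 3, size: int = 500) -> str:
    """Return the author whose profile is least dissimilar to the text's profile."""
    profile = build_profile(text, n, size)
    return min(author_profiles,
               key=lambda author: dissimilarity(profile, author_profiles[author]))


if __name__ == "__main__":
    # Toy training data; working on raw bytes keeps the sketch language independent.
    profiles = {
        "author_a": build_profile("the cat sat on the mat the cat".encode("utf-8")),
        "author_b": build_profile("zzzz qqqq zzzz qqqq".encode("utf-8")),
    }
    print(attribute("the cat on the mat".encode("utf-8"), profiles))
```

Because the profiles are built from raw bytes rather than words, the same code applies unchanged to English, Greek, or Chinese text in any fixed encoding, and the `size` cap keeps every profile bounded regardless of how much training text an author has.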