Publication | Closed Access
Authorship attribution
Citations: 186
References: 8
Year: 2007
Unknown Venue
Engineering, Writer Identification, Corpus Linguistics, Journalism, Text Mining, Natural Language Processing, Support Vector Classifier, Information Retrieval, Data Science, Data Mining, Computational Linguistics, Document Classification, Language Studies, Content Analysis, Automatic Classification, Knowledge Discovery, Author Profiling, Authorship Attribution, Vector Space Model, Nonparametric Methods, Linguistics
Authorship attribution is the process of determining the writer of a document. The literature offers many classification techniques for this task. In this paper we explore information retrieval methods, such as tf-idf weighting with support vector machines, alongside parametric and nonparametric methods, using both supervised and unsupervised (clustering) techniques for authorship attribution. We ran experiments on articles gathered from the Turkish newspaper Milliyet, applying different classifiers to different feature sets extracted from these texts, and combined the results to improve our success rates. We identified which classifiers give satisfactory results on which feature sets. According to the experiments, success rates change dramatically across combinations; the best results come from a support vector classifier with bag-of-words features and a Gaussian classifier with function words.
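
As a rough illustration of the tf-idf weighting mentioned in the abstract, the sketch below builds bag-of-words tf-idf vectors using only the Python standard library. It is not the authors' implementation: the whitespace tokenizer, the unsmoothed `log(N / df)` form of idf, the function names `tokenize` and `tfidf_vectors`, and the three toy English sentences are all assumptions made for the example. In a setup like the one described, vectors of this kind would then be passed to a classifier such as a support vector classifier, which is not shown here.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive, assumed tokenizer)."""
    return text.lower().split()


def tfidf_vectors(documents):
    """Return (vocabulary, vectors) with one tf-idf vector per document.

    tf  = term count / document length
    idf = log(N / df), where df is the number of documents containing the term
    (an assumed, unsmoothed variant; other idf formulas are common).
    """
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(tokenized)
    # Document frequency: count each term once per document it appears in.
    doc_freq = Counter(tok for doc in tokenized for tok in set(doc))

    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        length = len(doc) or 1  # guard against empty documents
        vectors.append([
            (counts[term] / length) * math.log(n_docs / doc_freq[term])
            for term in vocabulary
        ])
    return vocabulary, vectors


# Toy documents (hypothetical, not taken from the Milliyet corpus).
documents = [
    "the economy grew and the markets rose",
    "the team won and the fans cheered",
    "markets fell as the economy slowed",
]
vocabulary, vectors = tfidf_vectors(documents)
```

With this weighting, a word that occurs in every document (here `the`) receives weight zero, while a word confined to one document (here `cheered`) receives a positive weight only in that document. Function words are therefore suppressed by idf, which is one plausible reason to treat function-word frequencies as a separate feature set, as the abstract does.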