Publication | Closed Access
Multilingual author profiling using word embedding averages and SVMs
Citations: 23
References: 16
Year: 2016
Unknown Venue
Engineering, Cross-lingual Representation, Cross Genre Evaluation, Word Vectors, Corpus Linguistics, Journalism, Text Mining, Word Embeddings, Natural Language Processing, Applied Linguistics, Computational Social Science, Social Media, Computational Linguistics, Word Embedding Averages, Language Studies, Content Analysis, Social Medium Mining, Machine Translation, Author Profiling, Social Medium Data, Linguistics
This paper describes an experiment on author profiling of tweets in English and Spanish, with a focus on cross-genre evaluation. Profiling here consists of age and gender classification. The training sets were drawn from tweets, while the evaluation genres include blogs, hotel reviews, tweets collected at a different time, and other social media. A TF-IDF baseline was compared against averaged word vectors, with a Support Vector Machine as the classifier in both cases. Results show that averaged word vectors outperform TF-IDF in most cross-genre problems for both age and gender.
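The two feature representations compared in the abstract can be illustrated with a minimal sketch. The abstract does not give implementation details, so everything here is an assumption: the function names `average_word_vector` and `tfidf_vectors` are hypothetical, the embeddings are made-up two-dimensional toy values, out-of-vocabulary tokens are simply skipped, and the TF-IDF weighting is the plain raw-count times log(N / df) form. The SVM step is left out; either feature matrix could be passed to an SVM implementation such as scikit-learn's `LinearSVC`.

```python
import math
from collections import Counter


def average_word_vector(tokens, embeddings, dim):
    """Mean of the vectors of in-vocabulary tokens; zero vector if none are known."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def tfidf_vectors(docs):
    """Plain TF-IDF (raw term count times log(N / df)) over tokenized documents."""
    vocab = sorted({t for doc in docs for t in doc})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for doc in docs for t in set(doc))
    rows = []
    for doc in docs:
        tf = Counter(doc)
        rows.append([tf[t] * math.log(n_docs / df[t]) for t in vocab])
    return vocab, rows


# Toy, made-up embeddings (hypothetical values, not taken from the paper).
toy_embeddings = {"great": [1.0, 0.0], "hotel": [0.0, 1.0], "match": [0.5, 0.5]}
docs = [["great", "hotel"], ["great", "match", "tonight"]]

# Dense features: one fixed-length vector per document, whatever the vocabulary.
dense = [average_word_vector(d, toy_embeddings, dim=2) for d in docs]

# Sparse features: one column per vocabulary term seen in the corpus.
vocab, sparse = tfidf_vectors(docs)
```

One point the sketch makes visible: the averaged vector has the same length for every document regardless of which words occur, whereas TF-IDF columns are tied to the training vocabulary. That may be one reason averaged embeddings transfer better across genres, though the abstract itself does not state a cause.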