Publication | Closed Access

Empirical Evaluation of Profile Characteristics for Gender Classification on Twitter

Citations: 60

References: 15

Year: 2013

Abstract

Online Social Networks (OSNs) provide reliable communication among users from different countries. The volume of text generated by OSNs is huge and highly informative. Gender classification can serve commercial organizations for advertising, law enforcement for legal investigation, and others for social reasons. Here we explore profile characteristics for gender classification on Twitter. Unlike existing approaches to gender classification that depend heavily on posted text such as tweets, we study the relative strengths of different characteristics extracted from Twitter profiles (e.g., first name and background color in a user's profile page). Our goal is to evaluate profile characteristics with respect to their predictive accuracy and computational complexity. In addition, we provide a novel technique to reduce the number of features of text-based profile characteristics from the order of millions to a few thousand and, in some cases, to only 40 features. We prove the validity of our approach by examining different classifiers over a large dataset of Twitter profiles.
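
The abstract does not say how the feature reduction works, so the following is only an illustrative sketch of one generic way to shrink a text feature space: hashing character n-grams of a profile string into a fixed number of buckets. It is not the authors' technique. The function names `char_ngrams` and `hashed_features`, the trigram size, and the bucket counts are all assumptions chosen for illustration; the bucket counts merely echo the orders of magnitude mentioned above.

```python
import zlib
from collections import Counter


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a lowercased string.

    '^' and '$' mark the start and end so that prefixes and suffixes
    (often informative in first names) become distinct n-grams.
    """
    padded = f"^{text.lower()}$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def hashed_features(text, n_buckets=4096, n=3):
    """Map a profile string to a sparse count vector with n_buckets dimensions.

    Hypothetical helper: every n-gram is hashed to a bucket index, so the
    vector size is fixed no matter how many distinct n-grams the corpus has.
    """
    counts = Counter()
    for gram in char_ngrams(text, n):
        # crc32 is stable across runs, unlike the built-in hash() for str.
        bucket = zlib.crc32(gram.encode("utf-8")) % n_buckets
        counts[bucket] += 1
    return dict(counts)


if __name__ == "__main__":
    # A first name becomes a handful of bucket counts out of 4096 dimensions.
    print(hashed_features("Alexandra"))
    # With only 40 buckets the vector is tiny, but collisions are more likely.
    print(hashed_features("Alexandra", n_buckets=40))
```

The trade-off in this sketch is that distinct n-grams can collide in the same bucket, and collisions become more frequent as the bucket count shrinks; whether accuracy survives at very small sizes would have to be measured on real data.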
