Research Article Open Access

Evaluation and Sociolinguistic Analysis of Text Features for Gender and Age Identification

Vasiliki Simaki¹, Iosif Mporas² and Vasileios Megalooikonomou³
  • 1 Lund University, Sweden
  • 2 University of Hertfordshire, United Kingdom
  • 3 University of Patras, Greece

Abstract

The paper presents an interdisciplinary study in the field of automatic gender and age identification, informed by sociolinguistic knowledge of the gender- and age-related linguistic choices that social media users make. The authors gathered and investigated standard and novel text features used in text mining approaches to author profiling and demographic identification, and examined their efficacy in gender and age detection tasks on a corpus of social media texts. The most informative features are analyzed according to the nature of each feature, and the information derived from the features' importance scores is discussed.

American Journal of Engineering and Applied Sciences
Volume 9 No. 4, 2016, 868-876

DOI: https://doi.org/10.3844/ajeassp.2016.868.876

Submitted On: 3 August 2016 Published On: 25 September 2016

How to Cite: Simaki, V., Mporas, I. & Megalooikonomou, V. (2016). Evaluation and Sociolinguistic Analysis of Text Features for Gender and Age Identification. American Journal of Engineering and Applied Sciences, 9(4), 868-876. https://doi.org/10.3844/ajeassp.2016.868.876

Keywords

  • Sociolinguistics
  • Text Mining
  • Feature Ranking
  • ReliefF Algorithm
  • Gender Detection
  • Age Identification