Published March 1, 2016 | Version 2016-03
Dataset Open

TwiSty: a multilingual Twitter Stylometry corpus for gender and personality profiling

  • 1. CLiPS Research Center, University of Antwerp
  • 2. University of Groningen

Description

TwiSty is a corpus developed for research in author profiling. It contains personality (MBTI) and gender annotations for a total of 18,168 authors spanning six languages. We distribute the Twitter IDs of these authors, as well as the IDs of their tweets that were available at the time of corpus development. The tweets have undergone language identification and are split into two categories: Confirmed (identified as belonging to the language in which the author is situated) and Other.

Files (308.4 MB)

twisty.zip (308.4 MB)
md5:ce1199cc6ec1635ccfce63e4cc9d63d7
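The MD5 checksum listed for twisty.zip can be used to confirm that a download completed intact. The following is a minimal sketch, assuming the archive has been saved as twisty.zip in the current working directory; the function names md5_of_file and verify_archive are illustrative and not part of the dataset.

```python
import hashlib
from pathlib import Path

# Checksum as listed on this record for twisty.zip.
EXPECTED_MD5 = "ce1199cc6ec1635ccfce63e4cc9d63d7"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected MD5 digest."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("twisty.zip")  # assumed local download location
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually indicates a truncated or corrupted download, in which case fetching the archive again is the simplest remedy.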

Additional details

Related works

Is supplemented by
Conference paper: gnd:883-383-734-892-8 (gnd)