Dataset Open Access
Despoina Chatzakou; Nicolas Kourtellis; Jeremy Blackburn; Emiliano De Cristofaro; Gianluca Stringhini; Athena Vakali
In recent years, bullying and aggression against social media users have grown significantly, causing serious consequences to victims of all demographics. Nowadays, cyberbullying affects more than half of young social media users worldwide, suffering from prolonged and/or coordinated digital harassment. Also, tools and technologies geared to understand and mitigate it are scarce and mostly ineffective. In this paper, we present a principled and scalable approach to detect bullying and aggressive behaviour on Twitter. We propose a robust methodology for extracting text, user, and network-based attributes, studying the properties of bullies and aggressors, and what features distinguish them from regular users. We find that bullies post less, participate in fewer online communities, and are less popular than normal users. Aggressors are relatively popular and tend to include more negativity in their posts. We evaluate our methodology using a corpus of 1.6M tweets posted over 3 months, and show that machine learning classification algorithms can accurately detect users exhibiting bullying and aggressive behaviour, with over 90% AUC.
Name | Size |
---|---|
websci_dataset.zip (md5:2530ebb7c3b90412eb91d3a5f1c0b590) | 89.2 kB |
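
The MD5 checksum in the file listing can be used to confirm that a downloaded copy of the archive is intact. The following is a minimal sketch, assuming the archive was saved locally as `websci_dataset.zip`; the helper names `md5_of_file` and `verify_download` are illustrative and not part of the dataset.

```python
import hashlib

# Checksum as published in the file listing for websci_dataset.zip.
EXPECTED_MD5 = "2530ebb7c3b90412eb91d3a5f1c0b590"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()
```

Calling `verify_download("websci_dataset.zip")` returns `True` when the local file matches the published checksum; a `False` result suggests an incomplete or altered download.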
Metric | All versions | This version |
---|---|---|
Views | 2,652 | 2,650 |
Downloads | 723 | 723 |
Data volume | 64.5 MB | 64.5 MB |
Unique views | 2,460 | 2,458 |
Unique downloads | 704 | 704 |