Published April 2, 2018 | Version 1
Dataset Open

Deprecated Dataset for "Large Scale Crowdsourcing and Characterization of Twitter Abusive Behavior"

  • 1. Founta
  • 2. Djouvas
  • 3. Chatzakou
  • 4. Leontiadis
  • 5. Blackburn
  • 6. Stringhini
  • 7. Vakali
  • 8. Sirivianos
  • 9. Kourtellis

Description

This dataset is deprecated. The updated version is available at https://zenodo.org/record/3678559#.Xl9-Ji97FhE

Dataset for the publication "Large Scale Crowdsourcing and Characterization of Twitter Abusive Behavior". Antigoni-Maria Founta, Constantinos Djouvas, Despoina Chatzakou, Ilias Leontiadis, Jeremy Blackburn, Gianluca Stringhini, Athena Vakali, Michael Sirivianos and Nicolas Kourtellis. International AAAI Conference on Web and Social Media (ICWSM), 2018.

The dataset provided here includes an updated version of the original dataset, with ~100k tweets annotated using the CrowdFlower platform:

  • hatespeech_labels.csv: contains ~100k rows, where every row consists of a unique Tweet ID and its associated majority annotation
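
As a rough illustration of working with a file of this shape, the sketch below tallies the majority labels and counts distinct tweet IDs. It assumes a comma-separated file with no header row, the tweet ID in the first column and the label in the second; the function name `count_labels`, the column positions, and the sample rows (including the label names) are made up for the example and are not taken from the dataset.

```python
import csv
import io
from collections import Counter


def count_labels(csv_file, id_col=0, label_col=1):
    """Return (Counter of labels, number of distinct tweet IDs).

    Assumed layout (hypothetical): one row per tweet, comma-separated,
    tweet ID in column `id_col`, majority label in column `label_col`.
    """
    labels = Counter()
    seen_ids = set()
    for row in csv.reader(csv_file):
        if len(row) <= max(id_col, label_col):
            continue  # skip blank or malformed rows
        # Keep IDs as strings: tweet IDs are large integers and can lose
        # precision if parsed as floats.
        seen_ids.add(row[id_col].strip())
        labels[row[label_col].strip()] += 1
    return labels, len(seen_ids)


# Made-up sample rows; real IDs and label names will differ.
sample = io.StringIO("1001,label_a\n1002,label_b\n1003,label_a\n")
counts, n_ids = count_labels(sample)
print(counts.most_common(), n_ids)
```

For the real file, an open file handle could be passed instead of the in-memory sample, for example `open("hatespeech_labels.csv", newline="")`, adjusting the column positions if the actual layout differs.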

UPDATE: We have learned that a number of the tweets are no longer available for download from Twitter. Therefore, upon request, we can provide an additional file containing the full text of the ~100k tweets and their associated majority labels. The tweets are shuffled so that tweet IDs cannot be linked to texts, in keeping with Twitter's terms and conditions (T&C).

To obtain the file, contact a.m.founta at gmail dot com AND antonis26papa at gmail dot com.

Please cite the paper in any published work that uses any of these resources.

@inproceedings{founta2018large,
    title={Large Scale Crowdsourcing and Characterization of Twitter Abusive Behavior},
    author={Founta, Antigoni-Maria and Djouvas, Constantinos and Chatzakou, Despoina and Leontiadis, Ilias and Blackburn, Jeremy and Stringhini, Gianluca and Vakali, Athena and Sirivianos, Michael and Kourtellis, Nicolas},
    booktitle={11th International Conference on Web and Social Media, ICWSM 2018},
    year={2018},
    organization={AAAI Press}
}

For any further questions, contact a.m.founta at gmail dot com.

Publication DOI: https://doi.org/10.5281/zenodo.1443348

GitHub: https://github.com/ENCASEH2020/hatespeech-twitter

Files

Files (727.7 kB)

md5:b7c4230dbb6dc8612938c85670a98886
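
The MD5 checksum listed above can be used to confirm that a download is intact. The following is a minimal sketch, assuming the checksum applies to the single downloaded file; the helper names `md5_of_file` and `matches_expected` and the local path in the usage note are hypothetical.

```python
import hashlib

# Checksum as listed in the Files section of this record.
EXPECTED_MD5 = "b7c4230dbb6dc8612938c85670a98886"


def md5_of_file(path, chunk_size=65536):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected value."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_expected("hatespeech_labels.csv")` would return `True` for an uncorrupted copy, assuming the file was saved under that name in the working directory.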

Additional details

Funding

ENCASE – EnhaNcing seCurity And privacy in the Social wEb: a user centered approach for the protection of minors 691025
European Commission