Published March 13, 2019 | Version 1.0
Dataset Open

HaterNet a system for detecting and analyzing hate speech in Twitter

  • 1. Universidad Autonoma de Madrid
  • 2. Complutense University of Madrid
  • 3. State Secretariat for Security, Interior Ministry, Madrid, Spain

Description

This dataset consists of two corpora used in the paper "Detecting and analyzing hate speech in Twitter: HaterNet a system in the Spanish prevention of hate crime office". The first is based on tweets collected on different random dates between February 2017 and December 2017, with a final size of 2 million tweets. The second contains 6,000 tweets labeled, as described in the paper, as containing hate speech or not.

Files

labeled_corpus_6K.txt

Files (137.2 MB)

Checksum Size
md5:22b104d1d1f2f67fcbc616dfbcf7025b 878.4 kB
md5:93b37cbdb7262e302b3c9e135fcd0c9c 136.3 MB
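
The MD5 digests listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch in Python, not part of the original record: the helper names `md5_of_file` and `matches_listing` are hypothetical, and the local file path in the usage note is an assumption about where the file was saved.

```python
import hashlib

# MD5 digests copied from the file listing of this record.
LISTED_MD5 = {
    "22b104d1d1f2f67fcbc616dfbcf7025b",  # the 878.4 kB file
    "93b37cbdb7262e302b3c9e135fcd0c9c",  # the 136.3 MB file
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`.

    The file is read in chunks so the larger corpus does not have to
    fit in memory at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path):
    """Return True if the file's MD5 digest is one of the listed digests."""
    return md5_of_file(path) in LISTED_MD5
```

For example, after saving a file locally, `matches_listing("labeled_corpus_6K.txt")` should return `True` if the download is intact, assuming the file was saved under that name in the current directory. The check compares against both listed digests rather than a specific one, since the listing does not pair file names with checksums.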