Published March 7, 2019 | Version 1.0.0
Dataset Open

YouToxic English

  • 1. Graz University of Technology
  • 2. Know-Center GmbH

Description

This is a hand-labeled toxicity data set containing 1000 comments crawled from YouTube videos about the Ferguson unrest in 2014. In addition to an overall toxicity label, the data set contains labels for several subclassifications of toxicity, which form a hierarchical structure. Each comment can be assigned more than one of these labels. The hierarchy is as follows:

  • IsToxic
    • IsAbusive
      • IsThreat
      • IsProvocative
      • IsObscene
    • IsHatespeech
      • IsRacist
      • IsNationalist
      • IsSexist
      • IsHomophobic
      • IsReligiousHate
    • IsRadicalism
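
The hierarchy above can be written down as a child-to-parent mapping, which makes it easy to look up the ancestors of a label or to check whether a set of labels respects the hierarchy. The following is a minimal sketch, not part of the data set itself. It assumes that the CSV columns carry the same names as the labels listed above and that values are stored as the strings "True" and "False"; the helpers `to_bool`, `ancestors`, and `inconsistent_labels` are hypothetical names introduced only for illustration. Whether the published labels actually satisfy "child implies parent" is not stated in the description, so treat the check as exploratory.

```python
# Child -> parent mapping, transcribed from the enumeration in the description.
PARENT = {
    "IsAbusive": "IsToxic",
    "IsThreat": "IsAbusive",
    "IsProvocative": "IsAbusive",
    "IsObscene": "IsAbusive",
    "IsHatespeech": "IsToxic",
    "IsRacist": "IsHatespeech",
    "IsNationalist": "IsHatespeech",
    "IsSexist": "IsHatespeech",
    "IsHomophobic": "IsHatespeech",
    "IsReligiousHate": "IsHatespeech",
    "IsRadicalism": "IsToxic",
}


def to_bool(value):
    """Assumption: labels are stored as 'True'/'False' strings (any case)."""
    return str(value).strip().lower() == "true"


def ancestors(label):
    """Return the chain of parents of a label, nearest parent first."""
    chain = []
    while label in PARENT:
        label = PARENT[label]
        chain.append(label)
    return chain


def inconsistent_labels(labels):
    """Given a dict label -> bool, return the labels that are set
    although at least one of their ancestors is not set."""
    return sorted(
        child
        for child in PARENT
        if labels.get(child, False)
        and not all(labels.get(a, False) for a in ancestors(child))
    )
```

To apply this to the file, one could read it with `csv.DictReader`, build `{name: to_bool(row[name]) for name in ["IsToxic", *PARENT]}` for each row, and pass the result to `inconsistent_labels`. This again assumes the column names match the label names.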

Files

Name: youtoxic_english_1000.csv
Size: 293.1 kB
md5:700ad695e016f1b28e03867d7ed2f0c0
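
The md5 checksum in the file listing can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module. The function names `md5_of_file` and `verify_download` are hypothetical, and the sketch assumes the file has already been saved locally under a path of your choosing.

```python
import hashlib

# Checksum as shown in the file listing.
EXPECTED_MD5 = "700ad695e016f1b28e03867d7ed2f0c0"


def md5_of_file(path, chunk_size=65536):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected
```

For example, `verify_download("youtoxic_english_1000.csv")` should return `True` for an unmodified copy saved in the current directory. MD5 is suitable here only as an integrity check against accidental corruption, not as protection against deliberate tampering.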