Recalibrating classifiers for interpretable abusive content detection

Vidgen, Bertie; Staton, Sam; Hale, Scott; Kammar, Ohad; Margetts, Helen; Melham, Tom

doi:10.5281/zenodo.4075461

Published October 9, 2020 | Version v1

Dataset | Open access

1. The Alan Turing Institute
2. University of Oxford
3. The Alan Turing Institute & University of Oxford
4. University of Edinburgh

Dataset and code for the paper, 'Recalibrating classifiers for interpretable abusive content detection' by Vidgen et al. (2020) -- to appear at the NLP + CSS workshop at EMNLP 2020.

We provide:

1,000 annotated tweets, sampled by Davidson classifier score: 50 tweets from each of 20 score bins of width 0.05, drawn from a dataset of tweets directed against MPs in the UK 2017 General Election.
1,000 annotated tweets, sampled by Perspective classifier score: 50 tweets from each of 20 score bins of width 0.05, drawn from the same dataset of tweets directed against MPs in the UK 2017 General Election.
Code for recalibration in R and Stan.
Annotation guidelines for both datasets.
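
The score-stratified sampling described in the list above can be sketched as follows. This is a minimal illustration in Python, not the authors' sampling code: the function name `sample_by_score_bin`, the `(item, score)` input format, the handling of bin edges and the fixed seed are all assumptions made for the example.

```python
import random
from collections import defaultdict


def sample_by_score_bin(scored_items, n_bins=20, per_bin=50, seed=0):
    """Draw up to `per_bin` items from each equal-width score bin on [0, 1].

    `scored_items` is an iterable of (item, score) pairs (hypothetical format).
    """
    bins = defaultdict(list)
    for item, score in scored_items:
        # Assumption: a score of exactly 1.0 is placed in the last bin.
        index = min(int(score * n_bins), n_bins - 1)
        bins[index].append((item, score))

    rng = random.Random(seed)
    sample = []
    for index in range(n_bins):
        members = bins[index]
        # Take fewer than `per_bin` items if the bin is sparsely populated.
        sample.extend(rng.sample(members, min(per_bin, len(members))))
    return sample


# Usage with synthetic scores (no real tweets involved).
synthetic_rng = random.Random(1)
synthetic = [(f"tweet-{i}", synthetic_rng.random()) for i in range(20000)]
print(len(sample_by_score_bin(synthetic)))
```

Sampling a fixed number per bin, rather than sampling uniformly at random, ensures that rare high-scoring tweets are represented in the annotated set.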

Paper abstract

We investigate the use of machine learning classifiers for detecting online abuse in empirical research. We show that uncalibrated classifiers (i.e. where the 'raw' scores are used) align poorly with human evaluations. This limits their use to understand the dynamics, patterns and prevalence of online abuse. We examine two widely used classifiers (created by Perspective and Davidson et al.) on a dataset of tweets directed against candidates in the UK's 2017 general election.
A Bayesian approach is presented to recalibrate the raw scores from the classifiers, using probabilistic programming and newly annotated data. We argue that interpretability evaluation and recalibration is integral to the application of abusive content classifiers.
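
To illustrate what recalibrating raw scores against human annotations means, the sketch below maps raw classifier scores to observed abuse rates. It is not the Bayesian method of the paper, and it is not the R and Stan code in this record: it swaps in plain, non-Bayesian isotonic regression (pool adjacent violators) as a simple stand-in. The function names `fit_isotonic` and `recalibrate` and the toy data are assumptions.

```python
import bisect


def fit_isotonic(scores, labels):
    """Fit a non-decreasing step function from raw score to label rate.

    Returns (uppers, means): each block covers scores up to uppers[i]
    and predicts means[i]. Labels are 0/1 human annotations.
    """
    # Aggregate tied scores so each distinct score starts as one block.
    totals = {}
    for score, label in zip(scores, labels):
        total, count = totals.get(score, (0.0, 0))
        totals[score] = (total + label, count + 1)

    blocks = []  # each block: [sum of labels, count, highest score covered]
    for score in sorted(totals):
        total, count = totals[score]
        blocks.append([total, count, score])
        # Pool adjacent violators: merge while the means would decrease.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper

    uppers = [block[2] for block in blocks]
    means = [block[0] / block[1] for block in blocks]
    return uppers, means


def recalibrate(score, uppers, means):
    """Look up the recalibrated value for a raw score."""
    index = bisect.bisect_left(uppers, score)
    return means[min(index, len(means) - 1)]


# Toy data: raw scores with hypothetical 0/1 annotations.
uppers, means = fit_isotonic([0.1, 0.2, 0.3, 0.4, 0.5, 0.6], [0, 0, 1, 0, 1, 1])
print(recalibrate(0.35, uppers, means))
```

Unlike a Bayesian model, this stand-in returns a single point estimate per score and gives no measure of uncertainty.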

Files

Files (364.1 kB)

Name	MD5	Size
Vidgen-etal-recalibration-Davidson-annotations.csv	863a44f8abe2d3226500df17169fd285	161.5 kB
Vidgen-etal-recalibration_Davidson-instructions.docx	a513cbbeb3808e198cdb62dd16d75481	15.8 kB
Vidgen-etal-recalibration_Perspective-annotations.csv	1609b72a8909f8ef0429ab7dae5513ed	159.9 kB
Vidgen-etal-recalibration_Perspective-instructions.docx	615c21982838750c086671c205b9b322	14.4 kB
Vidgen-etal-recalibration_recalibrationCodeForHateSpeech.R	1f58613930820717e9e06fdf0fe14c8a	11.2 kB
Vidgen-etal-recalibrationsigmoid-spline-isotone.stan	992980984871d67c903046db91b37048	1.3 kB
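
The MD5 checksums listed for each file can be used to check that a download is intact. The following is a minimal sketch using Python's standard `hashlib`. The helper names `md5_of_file` and `verify_downloads` and the assumption that all files sit in one local directory are illustrative; the file names and checksums are copied from the listing above.

```python
import hashlib
from pathlib import Path

# File names and MD5 checksums as given in the file listing.
EXPECTED_MD5 = {
    "Vidgen-etal-recalibration-Davidson-annotations.csv": "863a44f8abe2d3226500df17169fd285",
    "Vidgen-etal-recalibration_Davidson-instructions.docx": "a513cbbeb3808e198cdb62dd16d75481",
    "Vidgen-etal-recalibration_Perspective-annotations.csv": "1609b72a8909f8ef0429ab7dae5513ed",
    "Vidgen-etal-recalibration_Perspective-instructions.docx": "615c21982838750c086671c205b9b322",
    "Vidgen-etal-recalibration_recalibrationCodeForHateSpeech.R": "1f58613930820717e9e06fdf0fe14c8a",
    "Vidgen-etal-recalibrationsigmoid-spline-isotone.stan": "992980984871d67c903046db91b37048",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory, expected=EXPECTED_MD5):
    """Map each expected file name to True if present with a matching MD5."""
    results = {}
    for name, checksum in expected.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of_file(path) == checksum
    return results


# Usage: check the current directory (hypothetical download location).
for name, ok in verify_downloads(".").items():
    print("OK     " if ok else "MISSING or MISMATCH", name)
```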

Additional details

Funding
UK Research and Innovation
Strategic Priorities Fund - AI for Science, Engineering, Health and Government (grant EP/T001569/1)

