Class-based Prediction Errors to Detect Hate Speech with Out-of-vocabulary Words

Joan Serra; Ilias Leontiadis; Dimitris Spathis; Gianluca Stringhini; Jeremy Blackburn; Athena Vakali

doi:10.5281/zenodo.884044

Published August 31, 2017 | Version v1

Conference paper | Open

1. Telefonica Research, Spain
2. Aristotle University, Greece
3. University College London, United Kingdom
4. Aristotle University,

Common approaches to text categorization essentially rely either on n-gram counts or on word embeddings. This presents important difficulties in highly dynamic or quickly interacting environments, where the appearance of new words and/or varied misspellings is the norm. A paradigmatic example of this situation is abusive online behavior, with social networks and media platforms struggling to effectively combat uncommon or non-blacklisted hate words. To better deal with these issues in those fast-paced environments, we propose using the error signal of class-based language models as input to text classification algorithms. In particular, we train a next-character prediction model for any given class, and then exploit the error of such class-based models to inform a neural network classifier. This way, we shift from the ability to describe seen documents to the ability to predict unseen content. Preliminary studies using out-of-vocabulary splits from abusive tweet data show promising results, outperforming competitive text categorization strategies by 4–11%.
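The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper trains a next-character prediction model per class and feeds its error to a neural network classifier, whereas this sketch swaps in an add-one-smoothed character trigram model and stops at building the per-class error vector. The names `CharNgramModel` and `error_features`, the class labels, and the toy corpora are all hypothetical, chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


class CharNgramModel:
    """Add-one-smoothed character n-gram model.

    Assumption: a simple stand-in for the next-character prediction
    model mentioned in the abstract, not the authors' model.
    """

    def __init__(self, order=3):
        self.order = order
        self.context_counts = defaultdict(Counter)
        self.alphabet = set()

    def _pairs(self, text):
        # Pad with start markers and append an end marker so that every
        # character, including the first, has a fixed-length context.
        k = self.order - 1
        padded = "\x02" * k + text + "\x03"
        for i in range(k, len(padded)):
            yield padded[i - k:i], padded[i]

    def fit(self, texts):
        for text in texts:
            for context, char in self._pairs(text):
                self.context_counts[context][char] += 1
                self.alphabet.add(char)
        return self

    def error(self, text):
        """Mean negative log-probability per predicted character."""
        vocab = len(self.alphabet) + 1  # +1 reserves mass for unseen characters
        total, n = 0.0, 0
        for context, char in self._pairs(text):
            counts = self.context_counts.get(context, Counter())
            prob = (counts[char] + 1) / (sum(counts.values()) + vocab)
            total -= math.log(prob)
            n += 1
        return total / n


def error_features(text, models):
    """One prediction error per class, in sorted label order."""
    return [models[label].error(text) for label in sorted(models)]


# Hypothetical toy corpora, one per class (benign placeholders).
corpora = {
    "cooking": ["bake the pasta", "boil the pasta", "bake bread",
                "boil the soup", "fresh bread"],
    "weather": ["sunny day", "rainy day", "sunny and rainy",
                "rainy morning", "sunny morning"],
}
models = {label: CharNgramModel(order=3).fit(texts)
          for label, texts in corpora.items()}

# "sunnyy" never appears in either corpus (an out-of-vocabulary misspelling),
# yet the weather model predicts its characters with lower error.
features = error_features("sunnyy", models)
print(dict(zip(sorted(models), features)))
```

The resulting vector has one entry per class, so it can be passed to any downstream classifier; the class whose model yields the lowest error is the one that predicts the unseen text best.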

Files

Name	Size
hate-ALW2017.pdf (md5:93cdcefb0b7662571bed7837dda79d51)	217.0 kB


	All versions	This version
Views	167	167
Downloads	123	123
Data volume	27.1 MB	27.1 MB


DOI

10.5281/zenodo.884044

Resource type

Conference paper

Publisher

Zenodo

Conference

Annual Meeting of the Association for Computational Linguistics (ACL '17), Workshop on Abusive Language Online (ACL17), Canada, 1-8 September 2017

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: September 4, 2017
Modified: August 3, 2024
