Published August 31, 2017 | Version v1
Conference paper Open

Class-based Prediction Errors to Detect Hate Speech with Out-of-vocabulary Words

  • 1. Telefonica Research, Spain
  • 2. Aristotle University, Greece
  • 3. University College London, United Kingdom
  • 4. Aristotle University,

Description

Common approaches to text categorization essentially rely either on n-gram counts or on word embeddings. This presents important difficulties in highly dynamic or quickly-interacting environments, where the appearance of new words and/or varied misspellings is the norm. A paradigmatic example of this situation is abusive online behavior, with social networks and media platforms struggling to effectively combat uncommon or non-blacklisted hate words. To better deal with these issues in those fast-paced environments, we propose using the error signal of class-based language models as input to text classification algorithms. In particular, we train a next-character prediction model for each class, and then exploit the error of these class-based models to inform a neural network classifier. This way, we shift from the ability to describe seen documents to the ability to predict unseen content. Preliminary studies using out-of-vocabulary splits from abusive tweet data show promising results, outperforming competitive text categorization strategies by 4–11%.
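
The idea of turning class-based prediction errors into classifier inputs can be illustrated with a minimal sketch. This is not the paper's implementation: as a simplifying assumption, a smoothed character n-gram model stands in for the next-character prediction model, and the class name `CharNgramLM`, the helper `error_features`, the labels, and the toy sentences are all hypothetical. The sketch fits one model per class and uses each model's mean negative log-probability on a text as one feature.

```python
import math
from collections import Counter, defaultdict


class CharNgramLM:
    """Add-one smoothed character n-gram model.

    Assumption: a simple stand-in for a next-character prediction model;
    the abstract does not specify the architecture.
    """

    def __init__(self, order=3):
        self.order = order
        self.context_counts = defaultdict(Counter)
        self.vocab = set()

    def _pad(self, text):
        # Start and end markers so the first and last characters are predicted too.
        return "\x02" * self.order + text + "\x03"

    def fit(self, texts):
        for text in texts:
            padded = self._pad(text)
            self.vocab.update(padded)
            for i in range(self.order, len(padded)):
                context = padded[i - self.order:i]
                self.context_counts[context][padded[i]] += 1
        return self

    def error(self, text):
        """Mean negative log-probability per predicted character."""
        padded = self._pad(text)
        vocab_size = len(self.vocab) + 1  # one extra slot for unseen characters
        total = 0.0
        for i in range(self.order, len(padded)):
            counts = self.context_counts.get(padded[i - self.order:i], {})
            seen = counts.get(padded[i], 0)
            prob = (seen + 1) / (sum(counts.values()) + vocab_size)
            total -= math.log(prob)
        return total / (len(padded) - self.order)


def error_features(models, text):
    """One prediction-error feature per class, in sorted label order."""
    return [models[label].error(text) for label in sorted(models)]


# Hypothetical toy data; real experiments would use labeled tweets.
train = {
    "abusive": ["you are an idiot", "what an idiot you are", "shut up you moron"],
    "neutral": ["have a nice day", "see you at the park", "thanks for the help"],
}
models = {label: CharNgramLM(order=3).fit(texts) for label, texts in train.items()}

# "idi0t" never appears in training, yet the abusive-class model
# predicts its characters with lower error than the neutral-class model.
features = error_features(models, "you idi0t")
print(dict(zip(sorted(models), features)))
```

In a fuller pipeline, these per-class error values (or a per-character error sequence) would be passed to a downstream classifier such as a neural network. The point of the sketch is only that a misspelled, out-of-vocabulary token can still yield an informative signal at the character level.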

Files

hate-ALW2017.pdf

Files (217.0 kB)

md5:93cdcefb0b7662571bed7837dda79d51