Published May 11, 2020 | Version v1
Conference paper | Open

A Comparative Study of Different State-of-the-Art Hate Speech Detection Methods in Hindi-English Code-Mixed Data

Description

Hate speech detection in social media communication has become a primary concern for avoiding conflict and curbing undesired activities. In an environment where multilingual speakers switch among multiple languages, hate speech detection becomes challenging for methods designed for monolingual corpora. In our work, we analyze and detect hate speech in code-mixed social media text and provide a comparative study of detection methods. We also provide a Hindi-English code-mixed dataset consisting of Facebook and Twitter posts and comments. Our experiments show that deep learning models trained on this code-mixed corpus perform better.

Files

rani2020comparative.pdf (230.7 kB)
md5:a0a57894710d20c856f1042a48db1355

Additional details

Funding

European Commission: ELEXIS – European Lexicographic Infrastructure (grant 731015)
European Commission: Pret-a-LLOD – Ready-to-use Multilingual Linked Language Data for Knowledge Services across Sectors (grant 825182)