Published May 6, 2022 | Version 1
Dataset Open

Dataset: Characterizing Anti-Asian Rhetoric During The COVID-19 Pandemic: A Sentiment Analysis Case Study on Twitter

  • 1. Georgia State University
  • 2. University of Waterloo
  • 3. University of California, Berkeley
  • 4. Coronavirus Visualization Team

Description

This record contains the dataset, trained model, and software that accompany the paper "Characterizing Anti-Asian Rhetoric During The COVID-19 Pandemic: A Sentiment Analysis Case Study on Twitter", accepted at the Workshop on Data for the Wellbeing of Most Vulnerable at the ICWSM 2022 conference.

The COVID-19 pandemic has seen a measurable increase in the use of sinophobic comments and terms on online social media platforms. In the United States, Asian Americans have been primarily targeted by violence and hate speech stemming from negative sentiments about the origins of the novel SARS-CoV-2 virus. While most published research focuses on extracting these sentiments from social media data, it does not connect specific news events during the pandemic with changes in negative sentiment on social media platforms. In this work, we combine and enhance publicly available resources with our own manually annotated set of tweets to build machine learning classification models that characterize sinophobic behavior. We then apply our classifier to a pre-filtered longitudinal dataset spanning two years of pandemic-related tweets and overlay our findings with relevant news events.

Files

Readme.pdf

Files (1.2 GB)

Checksum    Size
md5:d2e9ca2bd5078bcb007f168798d0f30d    4.7 kB
md5:8ece887971d16743b83eef42b3e52648    5.4 kB
md5:b944a9c86d2d26e53861c8079c0ec743    1.2 GB
md5:b820032b8b6f239e0afcce79af1ab3a6    845.7 kB
md5:ee38f401427e345e7e93b7a43b216280    4.1 kB
md5:16ea1c770535b5d705d25a0ee0381103    64.4 kB
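
The MD5 checksums listed above can be used to confirm that a download completed without corruption, which matters most for the 1.2 GB file. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_checksum` are illustrative and not part of the released software, and because the listing does not show file names, any path passed in is a placeholder for wherever the file was saved.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat even for multi-gigabyte files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file's MD5 with a listed value ('md5:<hex>' or '<hex>')."""
    # Drop an optional 'md5:' prefix and normalize case before comparing.
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_checksum(local_path, "md5:b944a9c86d2d26e53861c8079c0ec743")` would return `True` only if the file at the hypothetical `local_path` is byte-identical to the 1.2 GB file in this record.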

Additional details

Related works

Is published in
Conference paper: 10.36190/2022.81 (DOI)