SWSR: A Chinese Dataset and Lexicon for Online Sexism Detection
- 1. Queen Mary University of London
- 2. Oxford Brookes University
Description
Our repository presents the Sina Weibo Sexism Review (SWSR) dataset containing sexism-related posts in Chinese collected from Sina Weibo, as well as the Chinese lexicon SexHateLex.
The SWSR dataset consists of two files, SexWeibo.csv and SexComment.csv. The SexHateLex lexicon is a list of 3016 abusive terms stored in the file SexHateLex.txt.
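As a rough illustration of how the three files might be read, the sketch below uses only the Python standard library. It rests on assumptions that are not confirmed here: that SexHateLex.txt holds one term per line, that the CSV files are UTF-8 encoded with a header row, and that they sit in the working directory. The helper names (load_lexicon, load_rows, count_lexicon_hits) are hypothetical, and no column names are assumed.

```python
import csv
from pathlib import Path


def load_lexicon(path):
    """Read lexicon terms, ASSUMING one term per line; blank lines are skipped."""
    with open(path, encoding="utf-8-sig") as f:
        return [line.strip() for line in f if line.strip()]


def load_rows(path):
    """Read a CSV file into a list of dicts keyed by its header row (ASSUMED present)."""
    with open(path, encoding="utf-8-sig", newline="") as f:
        return list(csv.DictReader(f))


def count_lexicon_hits(text, lexicon):
    """Count how many distinct lexicon terms occur as substrings of text."""
    return sum(1 for term in set(lexicon) if term in text)


if __name__ == "__main__":
    # File names come from the description; their location is an assumption.
    lexicon_path = Path("SexHateLex.txt")
    if lexicon_path.exists():
        print(lexicon_path.name, "terms:", len(load_lexicon(lexicon_path)))
    for name in ("SexWeibo.csv", "SexComment.csv"):
        csv_path = Path(name)
        if csv_path.exists():
            rows = load_rows(csv_path)
            # Print the header so the real column names can be inspected.
            print(name, "rows:", len(rows), "columns:", list(rows[0]) if rows else [])
```

Substring matching is used because Chinese text is not whitespace-delimited. It is only a simple baseline for checking lexicon coverage, not the detection method described in the paper.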
Our work has been published in the journal Online Social Networks and Media. If you use this dataset, please cite:
Aiqi Jiang, Xiaohan Yang, Yang Liu, Arkaitz Zubiaga, SWSR: A Chinese dataset and lexicon for online sexism detection, Online Social Networks and Media, Volume 27, 2022, 100182, ISSN 2468-6964, https://doi.org/10.1016/j.osnem.2021.100182.
If you have any questions or suggestions about our work, please contact us at a.jiang@qmul.ac.uk. We also welcome ideas and collaboration related to Chinese sexist speech.
Files
README.md
Additional details
References
- A. Jiang, X. Yang, Y. Liu and A. Zubiaga (2021). SWSR: A Chinese Dataset and Lexicon for Sexist Hate Speech Detection. Under review.