SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2020)
Creators
- 1. Rochester Institute of Technology
- 2. Qatar Computing Research Institute, HBKU
- 3. IBM Research
- 4. University of Copenhagen
- 5. University of Cambridge
- 6. Qatar Computing Research Institute, Qatar
- 7. IT University Copenhagen, Denmark
- 8. University of Wolverhampton, UK
- 9. University of Tübingen, Germany
Description
The task comprises three subtasks corresponding to the levels of the hierarchical OLID annotation schema (Zampieri et al., 2019) introduced in OffensEval 2019. The task featured five languages; this upload covers English. In addition to Subtask A, English also featured Subtasks B and C. OffensEval 2020 was one of the most popular tasks at SemEval-2020, attracting a large number of participants across all subtasks and languages. A total of 528 teams signed up to participate, 145 teams submitted systems during the evaluation period, and 70 submitted system description papers.
This upload includes a test set used in the paper describing the shared-task dataset (Rosenthal et al., 2020), as well as the official test set used in the shared task.
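As a rough illustration of what "hierarchical" means here, the sketch below encodes the three OLID levels and checks whether a label combination respects the hierarchy. The label names (NOT/OFF, TIN/UNT, IND/GRP/OTH) follow the OLID paper (Zampieri et al., 2019) rather than anything stated in this record, and the names `OLID_LEVELS` and `is_consistent` are hypothetical helpers, not part of any released code.

```python
# Labels per level of the OLID taxonomy, as described in Zampieri et al. (2019).
# Assumption: the files in this upload use the same label strings.
OLID_LEVELS = {
    "A": ("NOT", "OFF"),         # Subtask A: offensive or not
    "B": ("TIN", "UNT"),         # Subtask B: targeted insult/threat or untargeted
    "C": ("IND", "GRP", "OTH"),  # Subtask C: individual, group, or other target
}


def is_consistent(a, b=None, c=None):
    """Return True if labels (a, b, c) respect the three-level hierarchy."""
    if a not in OLID_LEVELS["A"]:
        return False
    if a == "NOT":
        # Non-offensive posts carry no level B or level C label.
        return b is None and c is None
    if b not in OLID_LEVELS["B"]:
        return False
    if b == "UNT":
        # Untargeted offensive posts carry no level C label.
        return c is None
    # Targeted offensive posts need a target type.
    return c in OLID_LEVELS["C"]
```

A check like this may be useful for sanity-testing predictions before submission, since a level C label only makes sense for posts labelled OFF and TIN at the levels above.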
The evaluation phase for English is available on CodaLab: https://competitions.codalab.org/competitions/23285
The website for the shared task is https://sites.google.com/site/offensevalsharedtask/home
Files
extended_test-20200717T190516Z-001.zip
Total size: 244.1 MB
Checksum | Size |
---|---|
md5:72a47ea414eaa6116075d43810b6eb00 | 469.3 kB |
md5:c4026ffda9998603b9d9d94ce16e2692 | 8.6 kB |
md5:7476d6555ea05374002cfd0a46667a0b | 261.7 kB |
md5:4ff61a2c75f36e91d7b6616af2684859 | 226.9 MB |
md5:2bf18f6eb890ac7fc787897a79a7bc48 | 4.8 MB |
md5:f34dc50cffed1313026b29b91173a78d | 11.6 MB |
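The MD5 values listed for the files can be used to confirm that a download is intact. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `matches_listed_checksum` are illustrative, and any local file path passed to them is an assumption about where the download was saved.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use low for the larger archives.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum("downloaded_file.zip", "md5:4ff61a2c75f36e91d7b6616af2684859")` would return `True` only if the local file (a hypothetical name here) is byte-identical to the 226.9 MB file listed above.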
Additional details
References
- Rosenthal, Sara, et al. "A large-scale semi-supervised dataset for offensive language identification." arXiv preprint arXiv:2004.14454 (2020).
- Zampieri, Marcos, et al. "SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2020)." arXiv preprint arXiv:2006.07235 (2020).