SportsOri: A Novel Dataset for Analyzing Public Sentiment on Controversial Sports Events in YouTube Comments
Creators
Description
Sports engage billions of followers worldwide and affect the economy. Sports controversies often ignite passionate discussions among fans, analysts, and players. With the rise of social media, platforms like YouTube have become central to these discussions. This study performs stance detection (opinion mining) on YouTube comments about well-known public sports controversies, labelling each comment as for, against, or neutral.
To our knowledge, this is the first study and hand-curated dataset of civic engagement in controversial sports events, spanning around 40 years.
LLMs (the Llama and Deepseek reasoning families) were used for the initial stance annotation of comments and were later fine-tuned for a comparative performance analysis (30% boost in accuracy).
This dataset presents a collection of around 43k YouTube comments on famous and controversial public sports events.
We explore public sentiment analysis (stance detection) on a total of six famous controversial sports incidents by extracting and processing YouTube comments. Stance detection is performed by fine-tuning models such as Llama-3.1-8b and Deepseek reasoning models (Llama-8b distilled) on comments from events such as the Underarm Incident, Jonny Bairstow’s Run-Out Incident, Ashwin’s Mankading Event, and the Luis Suarez Handball Event.
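As a rough illustration of the three-way stance labelling described above, the following minimal sketch maps free-form model replies to the labels for, against, and neutral and scores them against reference labels. It is not the authors' pipeline: the function names `normalize_stance` and `accuracy`, the prefix-matching rule, and the sample replies are all assumptions made for this example.

```python
# Hypothetical helper names; the dataset's actual annotation code may differ.
STANCES = ("for", "against", "neutral")


def normalize_stance(raw):
    """Map a free-form model reply to one of the three stance labels.

    Returns None when the reply does not start with a known label
    (an assumed, deliberately simple matching rule).
    """
    text = raw.strip().lower()
    for label in STANCES:
        if text.startswith(label):
            return label
    return None


def accuracy(predicted, reference):
    """Fraction of predictions that equal the reference label."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        return 0.0
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


# Made-up replies and reference labels, for illustration only.
replies = ["For", "against.", "Neutral", "for"]
reference = ["for", "against", "neutral", "against"]
predicted = [normalize_stance(reply) for reply in replies]
print(accuracy(predicted, reference))
```

Comparing this score before and after fine-tuning is one simple way a boost in accuracy could be measured, assuming both runs are evaluated on the same reference labels.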
Files
Annotation Pipeline and Fine Tuning Details.pdf
Total size: 5.5 MB
Checksum (MD5) | Size |
---|---|
e038abd2c81066bec0d5d1e76fe60528 | 161.6 kB |
29af6b21551a99bc50dfc48a49f69bd2 | 212.3 kB |
e26b65fed3b3dc860fea2ebf26298fd2 | 117.8 kB |
9aeaf47c4d1c2b0c82c711b60c822031 | 138.0 kB |
8080165c13a14735e4fcb5b439343e04 | 1.0 MB |
639c551a18523560d2755aa02cc2698a | 319.5 kB |
5392cd596138837b48d2189c1cab57a3 | 543.1 kB |
ed894304d3b967d67e2fdaa5935dcd09 | 977.1 kB |
231ef8ab1b543a54048464c5220a9fd3 | 284.9 kB |
d7608839a13aa2ce48e4fbba4c544e0c | 413.0 kB |
178fc60c850be670d589161391419425 | 245.7 kB |
5d63925f24aef594d8fdd07509fde9f6 | 19.8 kB |
747e0874692c5dfb22b6f0e4869cd3d1 | 4.2 kB |
16aa64aefa589fa339914802e3b5ea03 | 755.8 kB |
548779a46deaeccfb899857c3f0f33b8 | 251.7 kB |
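The MD5 checksums listed above can be used to confirm that a downloaded file arrived intact. The sketch below uses only Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_checksum` and the example path are hypothetical, since the listing does not show file names.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 digest with a listed value.

    Accepts the value with or without the "md5:" prefix, in any letter case.
    """
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[4:]
    return md5_of_file(path) == expected
```

For example, `matches_checksum("downloaded_file.csv", "md5:e038abd2c81066bec0d5d1e76fe60528")` would return `True` only if the local file (a placeholder name here) has exactly that digest.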
Additional details
Dates
- Collected: 2024-05 (Data Collected)