Published September 24, 2025 | Version v1 | Preprint | Open
Sentiment and Sarcasm Detection in Bangla: Reactions to Bangladesh's 2023 World Cup
Description
Sentiment and sarcasm detection in Bangla presents unique challenges due to its rich morphology, culturally specific expressions, and limited annotated resources. This study introduces a dual-head classification framework based on BanglaBERT—a transformer model pretrained on Bangla corpora—to jointly detect sentiment polarity and sarcasm in 5,635 manually annotated social media comments from the 2023 ICC Cricket World Cup. The framework mitigates class imbalance using focal loss, inverse frequency-based class weighting, and multilabel stratified cross-validation. Evaluation shows weighted F1 scores of 0.89 for sentiment and 0.84 for sarcasm, with notable gains in minority classes such as neutral sentiment (F1 0.69) and sarcastic remarks (F1 0.60). A real-time Gradio interface demonstrates the system’s utility for social media analytics. Limitations include potential annotation biases and restricted generalizability beyond sports. The study contributes a novel annotated dataset, a reproducible pipeline, and insights into culturally adaptive modeling strategies for nuanced language tasks in low-resource NLP.
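To make the imbalance handling concrete, the following is a minimal sketch of how inverse frequency-based class weights can be combined with focal loss for a single classification head. It is not the authors' implementation: the weight formula `N / (K * count)`, the focusing parameter `gamma=2.0`, and the function names are assumptions chosen for illustration, since the description does not state them.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed weighting scheme: N / (K * count), so rarer classes get larger weights."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, class_weights, gamma=2.0):
    """Class-weighted focal loss for one example.

    The factor (1 - p_t) ** gamma shrinks the loss of examples the model
    already classifies confidently; gamma = 0 recovers weighted cross-entropy.
    gamma = 2.0 is a hypothetical default, not a value taken from the paper.
    """
    p_t = softmax(logits)[target]
    return -class_weights[target] * (1.0 - p_t) ** gamma * math.log(p_t)


# Toy label distribution: class 1 is the minority (e.g. sarcastic comments).
weights = inverse_frequency_weights([0, 0, 0, 1])
loss_majority = focal_loss([2.0, 0.0], target=0, class_weights=weights)
loss_minority = focal_loss([2.0, 0.0], target=1, class_weights=weights)
```

In a dual-head setup, one such loss would typically be computed per head (sentiment and sarcasm) and the two summed, although how the study combines them is not specified here.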
Files
| Name | Size | MD5 |
|---|---|---|
| main-document.pdf | 747.4 kB | bddcfca01b6413fa94314799339d8566 |