Published September 24, 2025 | Version v1
Preprint Open

Sentiment and Sarcasm Detection in Bangla: Reactions to Bangladesh's 2023 World Cup

  • 1. University of Chittagong
  • 2. Chittagong Independent University

Description

Sentiment and sarcasm detection in Bangla presents unique challenges due to the language's rich morphology, culturally specific expressions, and limited annotated resources. This study introduces a dual-head classification framework based on BanglaBERT, a transformer model pretrained on Bangla corpora, to jointly detect sentiment polarity and sarcasm in 5,635 manually annotated social media comments about the 2023 ICC Cricket World Cup. The framework mitigates class imbalance using focal loss, inverse frequency-based class weighting, and multilabel stratified cross-validation. Evaluation shows weighted F1 scores of 0.89 for sentiment and 0.84 for sarcasm, with notable gains in minority classes such as neutral sentiment (F1 0.69) and sarcastic remarks (F1 0.60). A real-time Gradio interface demonstrates the system’s utility for social media analytics. Limitations include potential annotation biases and restricted generalizability beyond sports. The study contributes a novel annotated dataset, a reproducible pipeline, and insights into culturally adaptive modeling strategies for nuanced language tasks in low-resource NLP.
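
The imbalance handling named in the description (focal loss combined with inverse frequency-based class weights, applied to two classification heads) can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The weight normalization N / (K * count), the focusing parameter gamma = 2.0, the equal-weight sum of the two head losses, and all function names and toy labels are assumptions made for illustration only.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Map each class to N / (K * count), so rarer classes get larger weights.

    This normalization is one common choice and is an assumption here.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, class_weights, gamma=2.0):
    """Class-weighted focal loss for one example: -w_t * (1 - p_t)^gamma * log(p_t)."""
    p_t = softmax(logits)[target]
    return -class_weights[target] * (1.0 - p_t) ** gamma * math.log(p_t)


def dual_head_loss(sent_logits, sent_target, sarc_logits, sarc_target,
                   sent_weights, sarc_weights, gamma=2.0):
    """Sum of the two head losses (equal weighting is an assumption)."""
    return (focal_loss(sent_logits, sent_target, sent_weights, gamma)
            + focal_loss(sarc_logits, sarc_target, sarc_weights, gamma))


# Hypothetical toy labels: three sentiment classes, two sarcasm classes.
sentiment_labels = [0, 0, 0, 0, 1, 1, 1, 2]
sarcasm_labels = [0, 0, 0, 0, 0, 0, 1, 1]

sent_w = inverse_frequency_weights(sentiment_labels)
sarc_w = inverse_frequency_weights(sarcasm_labels)

# One example whose true labels are the minority classes of both heads.
loss = dual_head_loss([0.2, 0.1, 1.5], 2, [0.3, 0.9], 1, sent_w, sarc_w)
print(sent_w, sarc_w, loss)
```

With gamma set to 0 the focal term disappears and the loss reduces to class-weighted cross-entropy. Larger gamma values shrink the contribution of examples the model already classifies confidently, which shifts the training signal toward hard and minority-class examples.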

Notes

Disclaimer: This is the author’s version of the manuscript submitted to ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP).
The final Version of Record will be available via ACM Digital Library.

Files

main-document.pdf

Files (747.4 kB)

Name: main-document.pdf
Size: 747.4 kB
md5:bddcfca01b6413fa94314799339d8566