NaijaSenti: A Nigerian Twitter Sentiment Corpus for Multilingual Sentiment Analysis

doi:10.5281/zenodo.6538055

Published May 11, 2022 | Version 1.0

Resource type: Dataset (open access)

1. Department of Software Engineering, Faculty of Computing, Bayero University Kano, 700241 Kano, Nigeria
2. Department of Information Technology, Faculty of Computing, Bayero University Kano, 700241 Kano, Nigeria
3. Department of Computer Science, Faculty of Computing, Bayero University Kano, 700241 Kano, Nigeria
4. Department of Computer Science, Ahmadu Bello University Zaria, Kaduna, Nigeria

We introduce the first large-scale human-annotated Twitter sentiment dataset for the four most widely spoken languages in Nigeria—Hausa, Igbo, Nigerian-Pidgin, and Yorùbá—consisting of around 30,000 annotated tweets per language (except for Nigerian-Pidgin), including a significant fraction of code-mixed tweets.

Notes

This work was carried out with support from Lacuna Fund, an initiative co-founded by The Rockefeller Foundation, Google.org, and Canada's International Development Research Centre. The views expressed herein do not necessarily represent those of Lacuna Fund, its Steering Committee, its funders, or Meridian Institute. We thank Tal Perry for providing the LightTag annotation tool.

Files

Files (16.3 MB)

Name	Size	MD5
isahmadbbr/NaijaSenti-1.0.zip	16.3 MB	58580671facf2172fbfdccf1fcfe8b4d
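
The MD5 digest listed for the archive can be used to check that a download is intact. The following is a minimal sketch, not part of the dataset's own tooling: the local file name NaijaSenti-1.0.zip and the helper names md5_of_file and verify_archive are assumptions made for illustration, and only the digest value comes from the file listing above.

```python
import hashlib
from pathlib import Path

# MD5 digest listed for the archive in this record's file table.
EXPECTED_MD5 = "58580671facf2172fbfdccf1fcfe8b4d"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected digest."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local download location; adjust to wherever the file was saved.
    archive = Path("NaijaSenti-1.0.zip")
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually points to an incomplete or corrupted download, in which case fetching the archive again is the simplest remedy.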

Additional details

Is supplement to: https://github.com/isahmadbbr/NaijaSenti/tree/1.0

