Published October 13, 2022 | Version 0.9.5
Dataset Open

IsiZulu News (articles and headlines) and Siswati News (headlines) Corpora - za-isizulu-siswati-news-2022

  • 1. Department of Computer Science, University of Pretoria
  • 2. Open Cities Lab

Description

IsiZulu News (articles and headlines) and Siswati News (headlines) Corpora - za-isizulu-siswati-news-2022

Reference paper

Madodonga, A., Marivate, V., & Adendorff, M. (2023). Izindaba-Tindzaba: Machine learning news categorisation for Long and Short Text for isiZulu and Siswati. Journal of the Digital Humanities Association of Southern Africa, 4(01). https://doi.org/10.55492/dhasa.v4i01.4449

 

> @article{Madodonga_Marivate_Adendorff_2023,
>   title={Izindaba-Tindzaba: Machine learning news categorisation for Long and Short Text for isiZulu and Siswati},
>   volume={4},
>   url={https://upjournals.up.ac.za/index.php/dhasa/article/view/4449},
>   DOI={10.55492/dhasa.v4i01.4449},
>   author={Madodonga, Andani and Marivate, Vukosi and Adendorff, Matthew},
>   year={2023},
>   month={Jan.}
> }

Notes

Dataset of isiZulu news (articles and headlines) and Siswati news (headlines). The data were scraped from the internet: the isiZulu data from the Isolezwe news website http://www.isolezwe.co.za, and the Siswati data from public posts on the SABC news LigwalagwalaFM Facebook page https://www.facebook.com/ligwalagwalafm/.

Files

dsfsi/za-isizulu-siswati-news-2022-v0.9.5.zip

Files (66.0 kB)

Size: 66.0 kB
Checksum: md5:df411ffc02a941b777d54d143bb19a8f
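
The md5 value listed for the archive can be used to confirm that a download is intact. The following is a minimal sketch, not part of the dataset itself: the local filename is an assumption (the archive as saved to the current directory), and the helper names `md5_of_file` and `matches_expected` are hypothetical.

```python
import hashlib
from pathlib import Path

# md5 checksum as listed on the record page for the v0.9.5 archive.
EXPECTED_MD5 = "df411ffc02a941b777d54d143bb19a8f"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Compare a file's md5 digest with the expected hex string."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local filename; adjust to wherever the archive was saved.
    archive = Path("za-isizulu-siswati-news-2022-v0.9.5.zip")
    if archive.exists():
        print("checksum OK" if matches_expected(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually points to an incomplete or corrupted download, or to a different version of the archive than the one listed here.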

Additional details