Published February 3, 2026 | Version v1
Dataset Open

Diverse Narratives and International Perspectives on the Russo-Ukrainian Offensive (DNIPRO)

Description

DNIPRO is a longitudinal corpus of 246,229 news articles documenting the Russo-Ukrainian war from February 2022 to August 2024. The dataset captures competing geopolitical narratives from 11 media outlets across five nation-states (Russia, Ukraine, U.S., U.K., China) in three languages (English, Russian, Mandarin Chinese).
The corpus includes comprehensive metadata, named entity annotations, sentiment scores for key actors and events, and topical framing labels based on a nine-category schema. All annotations are validated by human annotators. The dataset is designed to support research in computational journalism, media framing analysis, information warfare studies, and cross-lingual narrative analysis.
Note: Due to potential copyright restrictions, this release contains metadata and annotations only. Full article text must be retrieved via the provided URLs. All data collection complied with publishers' terms of service.
For detailed methodology, see the accompanying preprint (arXiv:2601.16309), to appear at the fifteenth biennial Language Resources and Evaluation Conference (LREC 2026).
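Because this release ships metadata and annotations only, the full text has to be fetched from each article's URL. The following is a minimal sketch of that step, not the authors' pipeline: the field name `url` and the helper names `fetch_html` and `rehydrate` are assumptions made for illustration, and the actual column names should be taken from the README. It returns raw HTML only; extracting the article body from the page is left out.

```python
import urllib.request
from typing import Callable, Iterable, Iterator, Optional


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Download one page and decode it as text (assumes UTF-8, replaces bad bytes)."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "dnipro-rehydrate-sketch"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def rehydrate(
    rows: Iterable[dict],
    url_field: str = "url",  # hypothetical column name; check the README
    fetch: Callable[[str], str] = fetch_html,
) -> Iterator[dict]:
    """Yield each metadata row with an added 'html' key (None if the fetch failed)."""
    for row in rows:
        html: Optional[str]
        try:
            html = fetch(row[url_field])
        except (OSError, ValueError):
            # URLError and timeouts are OSError subclasses; ValueError covers malformed URLs.
            html = None
        yield {**row, "html": html}
```

Passing `fetch` as a parameter keeps the loop testable offline and makes it easy to swap in a client that adds rate limiting or caching.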
 
Keywords: news corpus, media framing, Russo-Ukrainian war, multilingual, geopolitical narratives, information warfare, sentiment analysis, stance detection

Files

README.md

Files (190.3 MB)

Checksum Size
md5:f5a53c7ab38ba3772e879f1407d3d412 19.3 kB
md5:8ed269756d13149a8cb7cf99673e9e4c 32.9 MB
md5:18a7ecb81fa5540494fca63fd02d0b91 149.5 MB
md5:22f0c70e36e416e71ee7d4e7eadd4fc6 15.1 kB
md5:39c9f71d9468098a1ba2f02c2dc02c5b 32.7 kB
md5:ce8bbe0be9dcc6370421347935835964 7.8 MB
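The MD5 checksums listed above can be used to confirm that a download completed intact. The sketch below uses only the Python standard library; the helper names `md5_of` and `verify` are illustrative, and because the listing does not show which file name belongs to which checksum, the local path in the usage comment is a placeholder.

```python
import hashlib


def md5_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected: str) -> bool:
    """Compare a file's MD5 against an expected value, with or without the 'md5:' prefix."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of(path) == expected_hex


# Usage (placeholder path; substitute the file you actually downloaded):
# verify("path/to/downloaded_file", "md5:8ed269756d13149a8cb7cf99673e9e4c")
```

A mismatch usually indicates a truncated or corrupted download, so the file should be fetched again before use.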

Additional details

Additional titles

Subtitle (English)
A Longitudinal, Multinational, and Multilingual Corpus of News Coverage of the Russo-Ukrainian War

Software

Repository URL
https://github.com/dikshyam/dnipro-codebase
Programming language
Python
Development Status
Active