Modeling Linguistic Imprints of War Propaganda in a Russian Wikipedia Fork: A Comparative Analysis with the Original Wikipedia

Vestel, Anastasiia; Degaetano-Ortlieb, Stefania

doi:10.5281/zenodo.18959310

Published March 11, 2026 | Version v1

Conference paper Open

1. Saarland University

Although Wikipedia aspires to provide neutral information, alternative versions can be used for political manipulation. This paper analyzes how narratives about the Russo-Ukrainian War are linguistically reframed in a Russian Wikipedia Fork (RWFork) compared to the original Russian Wikipedia. Using Kullback-Leibler Divergence on a corpus of war-related edits in more than 13,000 articles, we identify key differences between the two versions. The original Wikipedia features Ukrainian references and administrative details, direct war terminology, and Ukraine’s territorial designation, governance, and statehood. RWFork replaces or removes these elements: it emphasizes the reassignment of Ukrainian territories to Russia, favors euphemistic war language, renames locations, and recognizes the Russia-backed DPR and LPR. These patterns closely align RWFork with demobilizational strategies observed in pro-Kremlin media.
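To illustrate the kind of comparison the abstract describes, the following is a minimal sketch of word-level Kullback-Leibler Divergence between two token lists. It is not the authors' implementation: the unigram model, the additive smoothing parameter `alpha`, the function names, and the English toy tokens are all assumptions made for illustration. For distributions A and B over a shared vocabulary, D(A || B) = sum over words w of p_A(w) * log2(p_A(w) / p_B(w)), and each summand can be read as that word's contribution to the divergence.

```python
import math
from collections import Counter


def kld_contributions(tokens_a, tokens_b, alpha=1.0):
    """Per-word contributions (in bits) to D(A || B).

    Assumption: unigram relative frequencies with additive smoothing
    (alpha), so that words missing from one corpus do not yield
    division by zero. The paper's actual settings may differ.
    """
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    vocab = set(counts_a) | set(counts_b)
    total_a = sum(counts_a.values()) + alpha * len(vocab)
    total_b = sum(counts_b.values()) + alpha * len(vocab)
    contributions = {}
    for word in vocab:
        p = (counts_a[word] + alpha) / total_a
        q = (counts_b[word] + alpha) / total_b
        contributions[word] = p * math.log2(p / q)
    return contributions


def kld(tokens_a, tokens_b, alpha=1.0):
    """Total divergence D(A || B): the sum of all per-word contributions."""
    return sum(kld_contributions(tokens_a, tokens_b, alpha).values())


# Hypothetical toy data, not taken from the paper's corpus.
original_tokens = ["war", "war", "ukraine", "city"]
fork_tokens = ["operation", "operation", "russia", "city"]

# Words most typical of the first corpus relative to the second.
typical_of_original = sorted(
    kld_contributions(original_tokens, fork_tokens).items(),
    key=lambda item: item[1],
    reverse=True,
)
typical_of_fork = sorted(
    kld_contributions(fork_tokens, original_tokens).items(),
    key=lambda item: item[1],
    reverse=True,
)
```

Because the measure is asymmetric, it is computed in both directions: ranking the contributions to D(original || fork) surfaces words typical of the original version, while D(fork || original) surfaces words typical of the fork.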

Files

LaTeCH_2026_final.pdf (1.0 MB, md5:26329d4315281cacf719216f888f1fd7)

Additional details

Funding

European Commission
CASCADE - Computational Analysis of Semantic Change Across Different Environments (101119511)

Dates

Submitted: 2026-01-05
Accepted: 2026-02-04
