Published September 22, 2025 | Version v1
Dataset Open

Article based topic modeling on re-processed historical newspapers (Dataset)

Description


This data repository contains the raw data and the topic modeling results for the paper "Article based topic modeling on re-processed historical newspapers" by Kara Kuebart, Christian Schultze and Felix Selgert, published in DHNB 2025.

All code used in this project is available at https://github.com/KaraKuebart/topic_model_reprocessed_newspapers.

For further information, please refer to the README.md file.

Files

README.md

Files (29.9 GB)

Size      Checksum
5.6 kB    md5:0cc2c09fb25e03abb60b3f5860be1d2b
13.6 GB   md5:26b80ff803437fa04c79ab23376db611
16.2 GB   md5:52a18ffa52baf965edcb97201ceee2f6
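
Because two of the files are larger than 13 GB, it may be worth checking a download against the MD5 values listed above before working with it. The following is a minimal sketch and not part of the published code. The function names and the local file name `downloaded_file` are hypothetical, since the record's file names are not reproduced here.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that multi-gigabyte files do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file's digest with a listed value such as 'md5:0cc2...'.
    The optional 'md5:' prefix is stripped before comparing."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


if __name__ == "__main__":
    # Hypothetical local file name; replace it with the actual download.
    local_path = "downloaded_file"
    listed_value = "md5:26b80ff803437fa04c79ab23376db611"
    if os.path.exists(local_path):
        print(matches_listed_md5(local_path, listed_value))
```

A result of `False` would suggest an incomplete or corrupted download, or that the file was compared against the checksum of a different entry.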

Additional details

Related works

Cites
Journal article: arXiv:2401.16845v2 (arXiv)

Software