Published December 30, 2025 | Version 2025.12.30.2
Resource type: Software | Access: Open

Journal Digital Corpus: Swedish Newsreel Transcriptions

Affiliation: Lund University

Description

A curated, timestamped transcription corpus derived from historical Swedish newsreels. It combines speech-to-text transcriptions with intertitle OCR from SF Veckorevy newsreels spanning five decades of 20th-century Swedish audiovisual media.

Notes

If you use this corpus, please cite both the data paper and the repository.

Files

Files (13.7 MB)

Name: Modern36/journal_digital_corpus-2025.12.30.2.zip
Size: 13.7 MB
md5:b242119a860811d1d2d7776c4e525186
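
The md5 value listed for the archive can be used to check that a download arrived intact. The following is a minimal sketch using only the Python standard library, not part of the record itself. The local file name and the helper names md5_of_file and matches_expected are assumptions made for illustration; adjust the path to wherever the archive was saved.

```python
import hashlib
from pathlib import Path

# Checksum copied from the Files listing of this record.
EXPECTED_MD5 = "b242119a860811d1d2d7776c4e525186"

# Assumption: the archive was saved under this name in the working directory.
ARCHIVE = Path("journal_digital_corpus-2025.12.30.2.zip")


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected hex string."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    if ARCHIVE.exists():
        print("checksum OK" if matches_expected(ARCHIVE) else "checksum MISMATCH")
    else:
        print(f"{ARCHIVE} not found; download it first")
```

A mismatch usually points to a truncated or corrupted download, in which case fetching the archive again is the simplest remedy.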

Additional details

Related works

Is described by
Data paper: 10.5334/johd.344 (DOI)
Is source of
Dataset: 10.5281/zenodo.18131655 (DOI)
Software: https://github.com/Modern36/jdc_reader (URL)
Is supplement to
Software: https://github.com/Modern36/journal_digital_corpus/tree/2025.12.30.2 (URL)