Published October 30, 2025 | Version 1.0
Dataset Open

DIRT corpus

Contributors

Project leader:

  • 1. Ghent University

Description

DIRT (Dutch In Reality TV) is a corpus of transcriptions of Dutch-language reality shows from the Netherlands and Belgium, made available for linguistic research. The corpus and its accompanying metadata are provided as a single zip file.

For more information on (i) the contents of the corpus, (ii) how to search the corpus, and (iii) the history of the corpus and the DIRT project, see the project documentation included in the zip file ("Over DIRT.pdf") or the project website: https://www.dirt.ugent.be/.

If you use the DIRT corpus, please cite the following reference:

Delaby, Gauthier, Lien Hellebaut & Ulrike Vogl. 2025. Het DIRT-corpus: Dutch in reality tv. Available at Zenodo: https://doi.org/10.5281/zenodo.17469487.

Files

DIRT-corpus versie 1.0 - 2025-10-30.zip
Size: 2.0 MB
md5:001d6d3380c31e9a558290cc6e72dc93
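
The md5 checksum listed above can be used to confirm that a downloaded copy of the zip file is intact. The following is a minimal sketch in Python using only the standard library. The function names (`md5_of_file`, `matches_published_checksum`) and the assumption that the archive sits in the current working directory under its published file name are illustrative and not part of the DIRT documentation.

```python
import hashlib
from pathlib import Path

# Checksum as published in the Files section of this record.
EXPECTED_MD5 = "001d6d3380c31e9a558290cc6e72dc93"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read until an empty bytes object signals end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected=EXPECTED_MD5):
    """Compare a file's MD5 digest with an expected value, ignoring case."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed location: the archive was saved in the current directory
    # under the file name shown on this page.
    archive = Path("DIRT-corpus versie 1.0 - 2025-10-30.zip")
    if not archive.exists():
        print(f"{archive} not found")
    elif matches_published_checksum(archive):
        print("checksum OK")
    else:
        print("checksum mismatch: consider downloading the file again")
```

A mismatch usually points to an incomplete or corrupted download. MD5 is adequate for this kind of integrity check, but it is not a safeguard against deliberate tampering.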