Published October 23, 2017 | Version v1
Dataset Restricted

Dutch Audio Description Corpus

  • 1. University of Antwerp


The Dutch Audio Description corpus is the first corpus of its kind. It includes the transcribed texts of 39 audio-described Dutch films and TV series, totalling 154,570 words and 3,074 minutes of video. The corpus was used to extract quantitative data on the language of audio description (AD): frequency counts of parts of speech, words, lemmas and collocations, as well as other text statistics such as reading speed, word and sentence length, text readability and type-token ratios (a statistical measure of lexical variety). The data registered here comprise the corpus files (XML files) of the transcribed audio descriptions, the multimodal concordancer developed for the project, and the raw data extracted from the corpus during the PhD project in which the corpus was developed.
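As an illustration of the type-token ratio mentioned above, the following is a minimal sketch in Python. It is not the project's actual pipeline: the tokenizer is a deliberately simple assumption (lowercased runs of letters), the function name `type_token_ratio` is hypothetical, and the sample sentence is invented rather than taken from the corpus.

```python
import re


def type_token_ratio(text: str) -> float:
    """Return the number of distinct word forms (types) divided by the
    total number of words (tokens). Returns 0.0 for text with no words."""
    # Assumed tokenizer: lowercase the text, then take runs of letters
    # (including accented ones), optionally joined by apostrophes.
    tokens = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)*", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Invented sample sentence, not taken from the corpus.
sample = "De man loopt naar de deur. De deur gaat open."
print(type_token_ratio(sample))  # 7 types / 10 tokens -> 0.7
```

A higher ratio indicates more lexical variety. Because the ratio tends to fall as a text gets longer, values are usually only comparable between texts of similar length.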





The record is publicly accessible, but files are restricted to users with access.


Access is restricted and can be granted on demand.
