Published July 6, 2020 | Version v1
Video/Audio | Open access

VariaNTS corpus: A spoken Dutch corpus containing talker and linguistic variability

  • 1. University Medical Center Groningen

Description

The VariaNTS (Variatie in Nederlandse Taal en Sprekers) corpus is a Dutch spoken corpus that was developed to maximize both linguistic and talker variability. It contains 1000 stimulus materials from 11 linguistic subcategories, recorded by 8 male and 8 female native speakers of standard Dutch. The corpus contains audio recordings, orthographic transcriptions, stimulus-specific details (such as word frequencies, neighborhood densities, and phonotactic probabilities), and talker details. The VariaNTS corpus aims to provide new materials for broad assessment of speech perception and word recognition in Dutch clinical and academic settings.

Notes

This work was primarily supported by a VICI grant (918-17-603) from the Netherlands Organization for Scientific Research (NWO) and the Netherlands Organization for Health Research and Development (ZonMw) to Deniz Başkent, a VENI grant (275-89-035) from the NWO to Terrin N. Tamati, and funds from the Heinsius Houbolt Foundation, Amsterdam, The Netherlands. The study is part of the research program of the Otorhinolaryngology Department of the University Medical Center Groningen: Healthy Aging and Communication.

Files

VariaNTS corpus.zip (1.3 GB)
md5:4afd4597180d064d7856e8f56b19500b
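
The MD5 checksum listed for the archive can be used to confirm that a 1.3 GB download arrived intact. The following is a minimal sketch in Python, not an official tool of the corpus: the helper names `md5_of_file` and `matches_expected` are hypothetical, and the script assumes the archive was saved as "VariaNTS corpus.zip" in the current working directory.

```python
import hashlib
from pathlib import Path

# Checksum as listed on this record for "VariaNTS corpus.zip".
EXPECTED_MD5 = "4afd4597180d064d7856e8f56b19500b"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that a large archive never has to fit in memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Compare a file's MD5 digest with the expected hex string,
    ignoring letter case in the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local download location; adjust to wherever the file was saved.
    archive = Path("VariaNTS corpus.zip")
    if archive.exists():
        print("checksum OK" if matches_expected(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually indicates an incomplete or corrupted download, in which case downloading the archive again is the simplest remedy.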