Published July 25, 2020 | Version v1
Dataset Open

DiPCo -- Dinner Party Corpus


We present a speech data corpus that simulates a "dinner party" scenario taking place in an everyday home environment. The corpus was created by recording multiple groups of four Amazon employee volunteers having a natural conversation in English around a dining table. The participants were recorded by a single-channel close-talk microphone and by five far-field 7-microphone array devices positioned at different locations in the recording room. The dataset contains the audio recordings and human-labeled transcripts of 10 sessions in total, each lasting between 15 and 45 minutes. The corpus was created to advance the field of noise-robust and distant speech processing and is intended to serve as a public research and benchmarking dataset.
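
The recording setup described above implies a fixed number of audio channels per session and a bounded total amount of audio. The following is a minimal sketch of that arithmetic, not part of the dataset or its tooling. The function names are hypothetical, and the close-talk count assumes one single-channel microphone per participant, which the description does not state explicitly.

```python
# Figures taken from the description above.
NUM_ARRAY_DEVICES = 5   # far-field array devices per session
MICS_PER_ARRAY = 7      # microphones per array device
NUM_PARTICIPANTS = 4    # participants per session
NUM_SESSIONS = 10
MIN_SESSION_MINUTES = 15
MAX_SESSION_MINUTES = 45

# ASSUMPTION: each participant has one single-channel close-talk microphone.
CLOSE_TALK_PER_PARTICIPANT = 1


def channel_counts():
    """Return the per-session channel counts implied by the setup."""
    far_field = NUM_ARRAY_DEVICES * MICS_PER_ARRAY
    close_talk = NUM_PARTICIPANTS * CLOSE_TALK_PER_PARTICIPANT
    return {
        "far_field": far_field,
        "close_talk": close_talk,
        "total": far_field + close_talk,
    }


def total_duration_bounds_minutes():
    """Return (lower, upper) bounds on total session time in minutes."""
    return (
        NUM_SESSIONS * MIN_SESSION_MINUTES,
        NUM_SESSIONS * MAX_SESSION_MINUTES,
    )


if __name__ == "__main__":
    print(channel_counts())
    print(total_duration_bounds_minutes())
```

Under these assumptions, each session has 35 far-field channels, and the 10 sessions together span between 150 and 450 minutes of conversation.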


Released under the CDLA-Permissive license.



Files (13.4 GB)

Name Size
13.4 GB
340.4 kB