Published March 2, 2015 | Version v1
Dataset Open

Data Corpus for the IEEE-AASP Challenge on the Acoustic Characterization of Environments (ACE)

  • 1. Imperial College London, UK
  • 2. Pindrop Inc., London, UK

Description

The aim of this challenge was to evaluate state-of-the-art algorithms for blind acoustic parameter estimation from speech and to promote the emerging area of research in this field.

Several established parameters and metrics are used to characterize the acoustics of a room. The most important are the Direct-to-Reverberant Ratio (DRR), the Reverberation Time (T60) and the reflection coefficient. The acoustic characteristics of a room, expressed through such parameters, can be used to predict the quality and intelligibility of speech signals in that room. Recently, several important speech enhancement and speech recognition methods have been developed that outperform their predecessors but require knowledge of one or more fundamental acoustic parameters, such as the T60. Traditionally, these parameters have been estimated from carefully measured Acoustic Impulse Responses (AIRs). However, in most applications it is impractical, or even impossible, to measure the AIR. Consequently, there is increasing research activity in estimating such parameters directly from speech and audio signals.
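As an illustration of the traditional, non-blind route mentioned above, the following minimal sketch estimates T60 and DRR from a measured AIR. It is a common textbook approach (Schroeder backward integration with a line fit to the decay curve, and a short window around the direct-path peak), not necessarily the procedure used to produce the corpus measurements. The function names, the -5 dB to -25 dB fit range and the 8 ms direct-path window are assumptions made for this example.

```python
import numpy as np


def energy_decay_curve_db(air):
    """Schroeder backward integration of the squared AIR, in dB relative to total energy."""
    energy = np.asarray(air, dtype=float) ** 2
    edc = np.cumsum(energy[::-1])[::-1]
    # Floor the curve so trailing zeros in the AIR do not produce log10(0).
    edc = np.maximum(edc, np.finfo(float).tiny)
    return 10.0 * np.log10(edc / edc[0])


def estimate_t60(air, fs, fit_range_db=(-5.0, -25.0)):
    """Fit a line to the decay curve between two levels and extrapolate to -60 dB.

    The fit range is an assumption for this sketch, not a corpus setting.
    """
    edc_db = energy_decay_curve_db(air)
    upper, lower = fit_range_db
    idx = np.where((edc_db <= upper) & (edc_db >= lower))[0]
    slope, _ = np.polyfit(idx / fs, edc_db[idx], 1)  # slope in dB per second
    return -60.0 / slope


def estimate_drr_db(air, fs, window_ms=8.0):
    """Ratio of direct-path energy (window around the peak) to the remaining energy, in dB.

    The window length is an assumption for this sketch.
    """
    air = np.asarray(air, dtype=float)
    peak = int(np.argmax(np.abs(air)))
    half = int(round(window_ms * 1e-3 * fs))
    start = max(peak - half, 0)
    stop = min(peak + half + 1, len(air))
    direct = np.sum(air[start:stop] ** 2)
    reverberant = np.sum(air ** 2) - direct
    return 10.0 * np.log10(direct / reverberant)


# Synthetic check: exponentially decaying noise whose energy falls 60 dB in true_t60 seconds.
fs = 48000
true_t60 = 0.5
rng = np.random.default_rng(0)
t = np.arange(fs) / fs
synthetic_air = rng.standard_normal(fs) * np.exp(-np.log(1000.0) * t / true_t60)
t60_estimate = estimate_t60(synthetic_air, fs)
```

On the synthetic response, `t60_estimate` should land close to `true_t60`. Real AIRs contain a noise floor, so the fit range usually has to stay above it.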

Documentation and software

  • Corpus instructions including software operating instructions
  • Software to generate new datasets from the corpus materials (Matlab)
  • T60 and DRR measurements in fullband and ISO-266 preferred frequency bands
  • Room dimensions and approximate positions of microphones and sources

Anechoic speech

The anechoic speech comprises a Development (Dev) set of 4 male talkers with 2 utterances each and an Evaluation (Eval) set of 5 male and 5 female talkers with 5 utterances each. All utterances were recorded in the anechoic chamber at TU Delft at fs=48 kHz in 16-bit format. Plain-text (.txt) transcriptions of each .wav file are included.

RIRs and noise by microphone configuration

Each archive below contains the room impulse responses (RIRs) and the ambient, fan and babble noise recordings, as fs=48 kHz 16-bit .wav files, for every room and microphone position in that microphone configuration. The recordings were made in 7 different rooms in the Dept. of Electrical and Electronic Engineering at Imperial College London.

The corpus comprises the following components:

  • Single-channel (based on cruciform channel 1) 417 MB
  • 2-channel laptop 1.05 GB
  • 3-channel mobile 1.59 GB
  • 5-channel cruciform 2.84 GB
  • 8-channel linear 4.24 GB
  • 32-channel spherical 14.2 GB
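The corpus materials are typically combined by convolving anechoic speech with an RIR and adding recorded noise at a chosen signal-to-noise ratio. The sketch below shows that idea for a single channel on plain NumPy arrays. It is an independent illustration, not the Matlab software supplied with the corpus, and the function name `mix_reverberant_noisy` and its SNR convention (mean power of the reverberant speech against mean power of the scaled noise) are assumptions made for this example.

```python
import numpy as np


def mix_reverberant_noisy(speech, rir, noise, snr_db):
    """Convolve speech with an RIR and add noise scaled to a target SNR (single channel).

    Returns the mixture, the reverberant speech and the scaled noise.
    """
    speech = np.asarray(speech, dtype=float)
    rir = np.asarray(rir, dtype=float)
    noise = np.asarray(noise, dtype=float)

    # Full convolution: output length is len(speech) + len(rir) - 1.
    reverberant = np.convolve(speech, rir)

    if len(noise) < len(reverberant):
        raise ValueError("noise recording is shorter than the reverberant speech")
    noise = noise[: len(reverberant)]

    # Scale the noise so that 10*log10(P_speech / P_noise) equals snr_db.
    speech_power = np.mean(reverberant ** 2)
    noise_power = np.mean(noise ** 2)
    gain = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    scaled_noise = gain * noise

    return reverberant + scaled_noise, reverberant, scaled_noise
```

For the multi-channel configurations, the same operation would be applied to each channel with its own RIR and noise channel, using one common noise gain if the inter-channel level differences are to be preserved.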

The corpus and the ACE Challenge are described in the following journal paper:

Please cite this whenever you use any part of the corpus. BibTeX references are available here for the journal paper and technical report.

Files

ACE_Corpus_instructions_v01.pdf

Files (24.5 GB)

MD5 checksum and size of each file:

  • md5:9d64e89b0e566022e00dfa7de9a5d7b7 (1.2 MB)
  • md5:7069d371c19a6812cea27f553864cbfd (77.6 kB)
  • md5:6e78f8a1b218dd2300ae06b1fb8710ad (3.4 MB)
  • md5:46b059e5585225bcae33dc112d031c72 (1.1 GB)
  • md5:c7b6feabd9dc9b7ec2d4570c18895158 (2.8 GB)
  • md5:f3f5c05863159f833ab5e99c91f2fb1b (14.2 GB)
  • md5:5686f3836ec061b2e903e8ffb566f2ec (4.2 GB)
  • md5:948e8def280c7282434eb7742f2b0737 (1.6 GB)
  • md5:d365026b3974a198ef58e8d3e647a415 (417.2 MB)
  • md5:07adaade86251cea9896708ec90d31c1 (608.8 kB)
  • md5:7cbfcf079dd44a67391b02b7c1696515 (148.2 MB)
  • md5:748b59d992db9c811c7c1ed7244403cc (374 Bytes)
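Because the larger archives run to several gigabytes, it can be worth checking a download against the MD5 checksums listed above. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_listing` are made up for this example, and the file is read in chunks so that a multi-gigabyte archive does not have to fit in memory.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_checksum):
    """Compare a file against a checksum written as 'md5:<hex>' or as bare hex."""
    expected = listed_checksum.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


# Self-check on a throwaway file containing the bytes b"hello".
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    demo_path = tmp.name
demo_ok = matches_listing(demo_path, "md5:5d41402abc4b2a76b9719d911017c592")
os.remove(demo_path)
```

For a real download, call `matches_listing` with the local path of the archive (whatever name it was saved under) and the corresponding `md5:` string from the listing. A `False` result usually indicates an incomplete or corrupted download.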