Dataset Open Access

EEG and audio dataset for auditory attention decoding

Fuglsang, Søren A.; Wong, Daniel D.E.; Hjortkjær, Jens

This dataset contains EEG recordings from 18 subjects listening to one of two competing speech audio streams. Continuous speech in trials of ~50 s was presented to normal-hearing listeners in simulated rooms with different degrees of reverberation. Subjects were asked to attend to one of two spatially separated speakers (one male, one female) and ignore the other. Repeated trials with presentation of a single talker were also recorded. The data were recorded in a double-walled soundproof booth at the Technical University of Denmark (DTU) using a 64-channel BioSemi system and digitized at a sampling rate of 512 Hz. Full details can be found in:

  • Søren A. Fuglsang, Torsten Dau & Jens Hjortkjær (2017): Noise-robust cortical tracking of attended speech in real-life environments. NeuroImage, 156, 435-444


  • Daniel D.E. Wong, Søren A. Fuglsang, Jens Hjortkjær, Enea Ceolini, Malcolm Slaney & Alain de Cheveigné: A Comparison of Temporal Response Function Estimation Methods for Auditory Attention Decoding. bioRxiv 281345; doi: 10.1101/281345

The data are organized in the format of the publicly available COCOHA Matlab Toolbox. The preproc_script.m script demonstrates how to import and align the EEG and audio data, as well as some of the EEG preprocessing steps used in the Wong et al. paper above. One of the files provided contains wav-files with the speech audio used in the experiment. Another contains MAT-files with the EEG/EOG data for each subject. The EEG/EOG data are found in data.eeg with the following channels:

  • channels 1-64: scalp EEG electrodes
  • channel 65: right mastoid electrode
  • channel 66: left mastoid electrode
  • channel 67: vertical EOG below right eye
  • channel 68: horizontal EOG right eye
  • channel 69: vertical EOG above right eye
  • channel 70: vertical EOG below left eye
  • channel 71: horizontal EOG left eye
  • channel 72: vertical EOG above left eye
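
The channel layout above can be captured in a small lookup, which helps when separating scalp EEG from the mastoid and EOG channels. The following is a minimal Python sketch that uses only the standard library and does not read the MAT-files. The group names (scalp_eeg, mastoid, eog) and the helper functions are hypothetical and chosen for illustration; only the channel numbers come from the list above. Channel numbers are 1-based as listed, so the helper converts them to 0-based indices for use in Python.

```python
# Channel layout as listed above, using 1-based channel numbers.
# Group names and function names are illustrative, not part of the dataset.
CHANNEL_GROUPS = {
    "scalp_eeg": range(1, 65),   # channels 1-64
    "mastoid": range(65, 67),    # channels 65-66
    "eog": range(67, 73),        # channels 67-72
}

N_CHANNELS = 72


def channel_group(channel):
    """Return the group name for a 1-based channel number."""
    for name, channels in CHANNEL_GROUPS.items():
        if channel in channels:
            return name
    raise ValueError(f"channel {channel} is outside 1..{N_CHANNELS}")


def zero_based_indices(group):
    """Return 0-based indices for a channel group (for Python indexing)."""
    return [c - 1 for c in CHANNEL_GROUPS[group]]


def split_sample(sample):
    """Split one sample (one value per channel, 72 values) by channel group."""
    if len(sample) != N_CHANNELS:
        raise ValueError(f"expected {N_CHANNELS} values, got {len(sample)}")
    return {
        group: [sample[i] for i in zero_based_indices(group)]
        for group in CHANNEL_GROUPS
    }


if __name__ == "__main__":
    dummy = list(range(N_CHANNELS))  # placeholder values, not real EEG
    parts = split_sample(dummy)
    print({group: len(values) for group, values in parts.items()})
```

The same index lists could be applied along the channel axis of an array once the data have been loaded by whatever MAT-file reader is in use.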

The expinfo table contains information about the experimental conditions, including which speaker the listener was attending to in each trial. It contains the following fields:

  • attend_mf: attended speaker (1=male, 2=female)
  • attend_lr: spatial position of the attended speaker (1=left, 2=right)
  • acoustic_condition: type of acoustic room (1= anechoic, 2= mild reverberation, 3= high reverberation, see Fuglsang et al. for details)
  • n_speakers: number of speakers presented (1 or 2)
  • wavfile_male: name of presented audio wav-file for the male speaker
  • wavfile_female: name of presented audio wav-file for the female speaker (if any)
  • trigger: trigger event value for each trial, also found in data.event.eeg.value

A further file contains the preprocessed EEG and audio data as output from preproc_script.m.
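
The numeric codes in the expinfo fields can be translated into readable labels with a few dictionaries. The sketch below is a hedged Python illustration that assumes one expinfo row has already been converted to a plain dict keyed by the field names listed above. The function name describe_trial, the output keys, and the example row are made up for illustration and are not taken from the dataset.

```python
# Code-to-label mappings taken from the expinfo field descriptions above.
ATTEND_MF = {1: "male", 2: "female"}
ATTEND_LR = {1: "left", 2: "right"}
ACOUSTIC_CONDITION = {
    1: "anechoic",
    2: "mild reverberation",
    3: "high reverberation",
}


def describe_trial(row):
    """Translate the coded fields of one expinfo row (a dict) into labels.

    Codes that are missing or not listed in the description map to "unknown".
    """
    n_speakers = row.get("n_speakers")
    if n_speakers not in (1, 2):
        raise ValueError(f"n_speakers should be 1 or 2, got {n_speakers!r}")
    return {
        "attended_speaker": ATTEND_MF.get(row.get("attend_mf"), "unknown"),
        "attended_side": ATTEND_LR.get(row.get("attend_lr"), "unknown"),
        "room": ACOUSTIC_CONDITION.get(row.get("acoustic_condition"), "unknown"),
        "competing_talker": n_speakers == 2,
    }


if __name__ == "__main__":
    # Made-up example row; the values are not taken from the dataset.
    example_row = {
        "attend_mf": 2,
        "attend_lr": 1,
        "acoustic_condition": 3,
        "n_speakers": 2,
    }
    print(describe_trial(example_row))
```

Mapping unlisted codes to "unknown" rather than raising an error is a deliberate choice here, since the description does not say how every field is coded in single-talker trials.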

The dataset was created within the COCOHA Project: Cognitive Control of a Hearing Aid

Files (18.3 GB)
Size
385.1 MB
1.8 GB
16.0 GB
6.3 kB