Four-speaker EEG dataset
Description
Please cite the original papers in which this dataset was presented:
Y. Yan, X. Xu, H. Zhu, P. Tian, Z. Ge, X. Wu, and J. Chen, “Auditory Attention Decoding in Four-Talker Environment with EEG,” in Interspeech 2024. Kos, Greece: ISCA, Sep. 2024, pp. 432–436.
X. Xu, B. Wang, Y. Yan, X. Wu, and J. Chen, “A DenseNet-Based Method for Decoding Auditory Spatial Attention with EEG,” in ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Seoul, Korea, Republic of: IEEE, Apr. 2024, pp. 1946–1950.
In our code repository https://github.com/xuxiran/AAD_4direction_code.git, we have provided the envelopes of the experimental stimuli. If you require the original speech stimuli for auditory attention decoding (AAD) tasks, please contact us via email (2301111611@stu.pku.edu.cn) with a brief explanation of your research needs.
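For orientation, the broadband envelope commonly used in AAD can be computed from a speech waveform roughly as in the minimal sketch below. The file name and the 64 Hz target rate are placeholders; the exact pipeline behind the distributed envelopes is defined in the repository, not here.

```python
# Minimal sketch of broadband envelope extraction for AAD.
# Assumptions: "talker1.wav" is a placeholder file name; the Hilbert-based
# envelope and the 64 Hz target rate are common choices, not necessarily
# the exact pipeline used for the envelopes provided in the repository.
import numpy as np
import soundfile as sf
from scipy.signal import hilbert, resample_poly

audio, fs = sf.read("talker1.wav")   # mono speech at 48 kHz (placeholder)
envelope = np.abs(hilbert(audio))    # magnitude of the analytic signal
envelope = resample_poly(envelope, up=64, down=fs)  # downsample to 64 Hz
```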
The details of the experimental setup are described in the original papers.
Methods
Participants
Sixteen university students (age range: 19–26 years) with normal hearing took part in the experiment; their audiometric thresholds were below 20 dB hearing level at frequencies from 250 Hz to 8000 Hz in both ears. All subjects were native Mandarin speakers and right-handed, and ten of them were male. Before the experiment, the subjects were informed about its procedure and objectives. The experiment was designed in accordance with ethics guidelines and was approved by the Peking University Institutional Review Board.
Stimuli and experimental procedure
The audio materials used in the experiment were taken from a previous study [1]. The speech corpus was selected from the Chinese translation of Twenty Thousand Leagues under the Sea by Jules Verne. Two speakers (one female and one male, both standard-Mandarin speakers) narrated twenty different segments, each lasting more than 60 s. The mean F0 was about 207 Hz for the female speaker and 124 Hz for the male speaker. The speech of the other two talkers was produced by shifting up the F0 of the original recordings with the speech synthesis tools of Adobe Audition, resulting in mean F0s of 230 Hz and 136 Hz for the female and male speech, respectively. All speech was clipped into consecutive 60 s segments starting from the speech onset. In the experiment, 40 speech combinations were used, each consisting of one segment from the original female talker, one from the original male talker, one from the higher-F0 female talker, and one from the higher-F0 male talker. All audio was recorded at a sampling rate of 48 kHz.
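For illustration only, shifting a mean F0 of 207 Hz to 230 Hz corresponds to about 12·log2(230/207) ≈ 1.8 semitones. The sketch below reproduces such a shift with librosa as an open-source stand-in for the Adobe Audition processing actually used; file names are placeholders.

```python
# Illustrative F0 shift, NOT the original Adobe Audition processing.
# File names are placeholders.
import numpy as np
import librosa
import soundfile as sf

y, sr = librosa.load("female_talker.wav", sr=None)  # keep native 48 kHz
n_steps = 12 * np.log2(230 / 207)  # ~1.83 semitones (207 Hz -> 230 Hz)
y_shifted = librosa.effects.pitch_shift(y, sr=sr, n_steps=n_steps)
sf.write("female_talker_highF0.wav", y_shifted, sr)
```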
The acoustic environment used in this study was similar to previous work [2], as shown in Figure 1. The experiments were conducted in an anechoic chamber measuring 6.5 m × 4.8 m × 3.2 m [3], with a cutoff frequency of 70 Hz. A circular acoustic free-field system with a radius of 1.6 m, consisting of four loudspeakers (Dynaudio BM 6A), was set up inside the chamber. The loudspeakers were positioned symmetrically in a semicircle at equal angular distances, at +30° (LS1), -30° (LS2), +90° (LS3), and -90° (LS4), at the height of the listener’s ears.
Figure 1: Setup of the stimulus presentation and the data acquisition.
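For reference, the loudspeaker positions can be converted to Cartesian coordinates with a few lines. The sign convention assumed below (0° straight ahead, positive azimuth to the listener's left) is a placeholder; the papers define the actual convention.

```python
# Cartesian loudspeaker positions on the 1.6 m circle.
# Assumption: 0° is straight ahead and positive azimuth is to the
# listener's left; see the original papers for the actual convention.
import math

RADIUS = 1.6  # m
SPEAKERS = {"LS1": +30, "LS2": -30, "LS3": +90, "LS4": -90}  # azimuth, deg

for name, az in SPEAKERS.items():
    x = RADIUS * math.sin(math.radians(az))  # lateral offset
    y = RADIUS * math.cos(math.radians(az))  # frontal distance
    print(f"{name}: azimuth {az:+d}°, x = {x:+.2f} m, y = {y:+.2f} m")
```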
In the experiment, subjects sat at a desk with their head fixed by a chinrest at the center of the circle. A monitor on the desk presented visual feedback and relevant instructions. Before each stimulus presentation, subjects were cued to attend to the speech presented from a loudspeaker in a given direction and to ignore the other three. During the presentation, the four talkers of a speech combination were presented simultaneously through the four loudspeakers, each at 55 dB LAeq. For each subject, 10 of the 40 combinations were assigned to each spatial condition, and each combination was presented once with a different attended talker. Thus, each subject completed 40 trials in total.
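The resulting per-subject design (4 attended directions × 10 combinations = 40 trials) can be sketched as follows; the assignment of specific combinations to directions shown here is an arbitrary placeholder, not the actual randomization scheme.

```python
# Sketch of the per-subject trial structure: 4 attended directions x
# 10 speech combinations = 40 trials. The combination-to-direction
# assignment below is a placeholder, not the actual scheme.
import random

directions = ["LS1 (+30°)", "LS2 (-30°)", "LS3 (+90°)", "LS4 (-90°)"]
combinations = list(range(40))  # 40 four-talker speech combinations
random.shuffle(combinations)

trials = [
    {"attended_direction": direction, "combination": combo}
    for direction, start in zip(directions, range(0, 40, 10))
    for combo in combinations[start:start + 10]
]
assert len(trials) == 40
```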
[1] Z. Fu, X. Wu, and J. Chen, “Congruent audiovisual speech enhances auditory attention decoding with EEG,” Journal of Neural Engineering, vol. 16, no. 6, p. 066033, Nov. 2019.
[2] P. J. Schäfer, F. I. Corona-Strauss, R. Hannemann, S. A. Hillyard, and D. J. Strauss, “Testing the Limits of the Stimulus Reconstruction Approach: Auditory Attention Decoding in a Four-Speaker Free Field Environment,” Trends in Hearing, vol. 22, p. 233121651881660, Jan. 2018.
[3] T. Qu, Z. Xiao, M. Gong, Y. Huang, X. Li, and X. Wu, “Distance-Dependent Head-Related Transfer Functions Measured With High Spatial Resolution Using a Spark Gap,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 17, no. 6, pp. 1124–1132, Aug. 2009.
Files

EEG_raw.zip (8.8 GB)

| Name | Size | MD5 |
|---|---|---|
| EEG_raw.zip | 8.8 GB | md5:c6ae641321461975345ef904c0636800 |
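After downloading, the archive can be checked against the MD5 listed above, for example with a short Python snippet:

```python
# Verify the downloaded archive against the checksum listed above.
import hashlib

EXPECTED_MD5 = "c6ae641321461975345ef904c0636800"

md5 = hashlib.md5()
with open("EEG_raw.zip", "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):  # read in 1 MiB chunks
        md5.update(chunk)

assert md5.hexdigest() == EXPECTED_MD5, "MD5 mismatch: re-download the file"
print("EEG_raw.zip checksum OK")
```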