MEG Audiovisual Matrix-Sentence Dataset (Natural, Avatar, Degraded, and Still-Image Versions of the Speaker)
Description
Audiovisual Stimuli
We used German 5-word matrix sentences derived from the Oldenburger Satztest. We video-recorded a speaker producing 571 different randomly generated sentences (600 were recorded originally; after screening for mispronunciations and disturbing sounds, 571 remained) at a frame rate of 29.97 fps and an audio sampling rate of 44.1 kHz.
We then processed the video clips into 4 different audiovisual conditions:
Natural Video: The natural video of the speaker.
Degraded Video: A visually degraded version of the videos using FFmpeg’s edgedetect filter (a batch-processing sketch follows this list):
ffmpeg -i input.mp4 -vf "edgedetect=low=0.1:high=0.4" output.mp4
Avatar: An avatar of the speaker generated with software from the company D-ID, which uses a CNN-based image encoder to process a still image of the talker and a GAN-based image-to-video model to animate lip movements in sync with the input audio (https://www.d-id.com/).
Still Image: A still image of the speaker combined with the audio track.
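For reference, batch-applying the same edgedetect filter could look like the following sketch. The directory layout and the .mp4 extension are assumptions, and ffmpeg must be available on the PATH:

import subprocess
from pathlib import Path

# Hypothetical directory layout; adjust to your local copy of the videos.
src_dir = Path("videos/natural")
dst_dir = Path("videos/degraded")
dst_dir.mkdir(parents=True, exist_ok=True)

for src in sorted(src_dir.glob("*.mp4")):
    # Same edgedetect settings as above; the audio stream is copied unchanged.
    subprocess.run(
        ["ffmpeg", "-i", str(src),
         "-vf", "edgedetect=low=0.1:high=0.4",
         "-c:a", "copy", str(dst_dir / src.name)],
        check=True,
    )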
Experimental Design
Each participant took part in two measurement sessions. In both sessions, sentences with different visual stimuli were presented in four-talker babble noise at -4 dB SNR. After each audiovisual sentence, the participants repeated what they had understood. After each visual-only sentence, the participants repeated the name they had lip-read. The sessions were structured as follows (a sanity check of the resulting trial counts follows the two session outlines):
Session 1:
SRT50 measurement with 80 audio-only sentences (data not included due to storage limitations --> available upon request)
1. Audiovisual block: three random sentences from each of the four AV conditions in random order --> 12 sentences
1. Visual-only block: three random sentences from each of the three visual-only conditions in random order --> 9 sentences
2. Audiovisual block
2. Visual-only block
…
8. Audiovisual block
8. Visual-only block
Session 2:
1. Audiovisual block
1. Visual-only block
…
12. Audiovisual block
12. Visual-only block
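A quick sanity check of the resulting trial counts (plain arithmetic, following the block structure above):

# Each block pair contains 12 audiovisual + 9 visual-only sentences.
session1 = 8 * (12 + 9)      # = 168 sentences (presentation indices 0-167)
session2 = 12 * (12 + 9)     # = 252 sentences (presentation indices 168-419)
print(session1 + session2)   # 420 sentences per participant in total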
MEG and Behavioral Data Structure
MEG data of 32 participants are contained in this dataset. Each participant has a directory "participants/px". In the participant folder, you can find a "px_overview.csv" file and a folder "participants/px/meg_data" with all the MEG data.
The overview file contains the sentence presentation order and the behavioral data. It is structured as follows:
- Column 1: numbers the sentences in the order they were presented in the measurement sessions (i). Sentences i = 0 – 167 were presented in session 1; sentences i = 168 – 419 were presented in session 2.
- Column 2: the visual condition of the presented sentence (natural, degraded, avatar, still_image).
- Column 3: the sentence ID (1 – 571) of the presented sentence (corresponding to the IDs in the stimuli directories).
- Column 4: 1 if the audio was audible for the participant (audiovisual stimuli), 0 if it was not (visual-only stimuli).
- Column 5: 1 if the presented name was understood/lip-read correctly.
- Column 6: 1 if the presented verb was understood correctly.
- Column 7: 1 if the presented amount was understood correctly.
- Column 8: 1 if the presented adjective was understood correctly.
- Column 9: 1 if the presented subject was understood correctly.
For visual-only stimuli, columns 6–9 are always 0, as participants only repeated the name.
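For example, per-condition intelligibility can be computed with pandas. This is a minimal sketch: the column names and the participant folder name "p1" are hypothetical placeholders, assigned in the order listed above:

import pandas as pd

# Hypothetical column names, assigned in the column order described above.
cols = ["presentation_index", "visual_condition", "sentence_id", "audible",
        "name", "verb", "amount", "adjective", "subject"]
# Drop header=None/names=cols if the file already ships with a header row.
overview = pd.read_csv("participants/p1/p1_overview.csv", header=None, names=cols)

# Proportion of correctly repeated names per audiovisual condition
av = overview[overview["audible"] == 1]
print(av.groupby("visual_condition")["name"].mean())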
The "participants/px/meg_data" directory contains an "i-raw.fif" file for each of the 420 presented sentences. The files can be loaded with the MNE library as follows:
import mne

meg = mne.read_raw_fif("…/1-raw.fif")

# The data array can be accessed:
meg_data = meg.get_data()

# The measurement info can be accessed:
meg_info = meg.info
The "participants/" directory additionally contains:
- "participants/participants_overview.csv": an overview of the age and sex of the participants.
- "read_me.txt": information on missing sentences for individual participants.
Stimuli Data Structure
The "stimuli" directory contains a folder for each audiovisual stimulus condition (avatar, degraded, natural, still_image). Additionally, the mp3 file of each sentence is in the folder "stimuli/mp3_files". Each of the five folders contains a version of each sentence. Each stimulus/sentence file is identifiable by its sentence ID, which ranges from 1 to 571.
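A sketch of how stimulus paths could be assembled; the file-name pattern and the .mp4 extension for the video conditions are assumptions:

from pathlib import Path

def stimulus_path(condition: str, sentence_id: int) -> Path:
    # Files are identified by their sentence ID (1-571); extensions assumed.
    ext = "mp3" if condition == "mp3_files" else "mp4"
    return Path("stimuli") / condition / f"{sentence_id}.{ext}"

print(stimulus_path("avatar", 42))     # stimuli/avatar/42.mp4
print(stimulus_path("mp3_files", 42))  # stimuli/mp3_files/42.mp3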
Technical Details
The MEG data were recorded at the University Hospital in Erlangen, Germany, using a 248-magnetometer system (4D Neuroimaging, San Diego, CA, USA).
The video signal was presented via a projector outside of the shielded chamber and displayed on a screen above the participant via mirrors.
The audio signal was transmitted via 2 m-long, 2 cm-diameter tubes, resulting in a 6 ms delay. The stimuli used in the experiment were corrected for this delay; the stimuli provided here have their original alignment (not corrected for the setup-specific 6 ms delay).
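If you reproduce the presentation setup, the compensation amounts to starting the audio 6 ms early relative to the video. A sketch, under the assumption that advancing the audio onset is the intended correction (the waveform here is a placeholder):

import numpy as np

fs = 44100                  # audio sampling rate of the stimuli
delay = round(0.006 * fs)   # 6 ms tube delay ~ 265 samples

audio = np.zeros(fs * 2)    # placeholder for a loaded stimulus waveform
# Start playback 6 ms early by trimming 6 ms from the waveform onset.
audio_corrected = audio[delay:]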
The attended sentence and the babble noise were presented diotically to both ears at a sound pressure level of 68 dB(A).
Processing of MEG data
Three MEG channels were removed from all measurement data as they were broken and showed no signal. The data were analog-filtered from 1.0 to 200 Hz during acquisition. They were then offline-filtered using a notch filter (firwin design, 0.5 Hz bandwidth) at the power-line frequencies (50, 100, 150, 200 Hz) and resampled from 1017.25 Hz to 1000 Hz.
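The files in this dataset are already preprocessed; for reference, the offline steps correspond roughly to the following MNE calls applied to the raw recordings (a sketch, not the authors' exact script; the input path is a placeholder):

import mne

raw = mne.io.read_raw_fif("…/1-raw.fif", preload=True)
# Notch filter at the power-line frequencies (firwin design, 0.5 Hz width)
raw.notch_filter(freqs=[50, 100, 150, 200], notch_widths=0.5,
                 fir_design="firwin")
# Resample to 1000 Hz
raw.resample(1000)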
Alignment of audio and MEG data
The MEG data are cut into sentence-long snippets aligned with the mp3 files. Load an mp3 file and resample it to 1000 Hz, then load the MEG file corresponding to the same sentence ID. The two loaded instances should now have the same number of time samples.
You can use librosa to load and resample the mp3 file:

import librosa

# sr=None preserves the native sampling rate (44.1 kHz)
audio_data, sr = librosa.load(audio_path, sr=None)
audio_data = librosa.resample(audio_data, orig_sr=sr, target_sr=1000)
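Putting both together, a minimal alignment check for one presented sentence. The paths, file-name patterns, and header-less CSV read are assumptions consistent with the sketches above:

import librosa
import mne
import pandas as pd

i = 0  # presentation index of the sentence (0-419)
overview = pd.read_csv("participants/p1/p1_overview.csv", header=None)
sentence_id = int(overview.iloc[i, 2])  # column 3: sentence ID

audio, sr = librosa.load(f"stimuli/mp3_files/{sentence_id}.mp3", sr=None)
audio = librosa.resample(audio, orig_sr=sr, target_sr=1000)

meg = mne.io.read_raw_fif(f"participants/p1/meg_data/{i}-raw.fif")
assert meg.n_times == len(audio)  # same number of time samples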
Paper to cite when using this data
Riegel et al., “Talking avatars can differentially modulate cortical speech tracking in the high and in the low delta band” (https://doi.org/10.64898/2026.01.07.695461)
Example Code
Example code on how to compute Temporal Response Functions and predictor variables is provided in a repository by Alina Schüller: https://github.com/Al2606/MEG-Analysis-Pipeline
Files
- meg_data_and_stimuli.zip (49.7 GB, md5:22856bf0156e9562077131c668d66828)
Additional details
Related works
- Is part of: Preprint 10.64898/2026.01.07.695461 (DOI)
Software
- Repository URL: https://github.com/Al2606/MEG-Analysis-Pipeline
- Programming language: Python