Larynx Microphone Singer-Songwriter Dataset

Schwär, Simon; Krause, Michael; Fast, Michael; Rosenzweig, Sebastian; Scherbaum, Frank; Mueller, Meinard

doi:10.5281/zenodo.20287765

Published February 23, 2024 | Version 1.0

Dataset Open

1. International Audio Laboratories Erlangen
2. Audoo Ltd.
3. University of Potsdam

Larynx microphones (LMs) provide a practical way to obtain crosstalk-free recordings of the human voice by picking up vibrations directly from the throat. This can be useful in a multitude of music information retrieval scenarios related to singing, e.g., the analysis of individual voices recorded in environments with lots of interfering noise. However, LMs have a limited frequency range and barely capture the effects of the vocal tract, which makes the recorded signal unsuitable for downstream tasks that require high-quality recordings. In this paper, we introduce the task of reconstructing a natural-sounding, high-quality singing voice recording from an LM signal. With an explicit focus on the singing voice, the problem lies at the intersection of speech enhancement and singing voice synthesis, with the additional requirement of faithful reproduction of expressive parameters like dynamics and intonation. To facilitate research in this area, we publish a dataset with over 3.5 hours of popular music that we recorded with four amateur singers accompanied by a guitar, where both LM and clean close-up microphone signals are available.

The dataset is part of the following publication:

Simon Schwär, Michael Krause, Michael Fast, Sebastian Rosenzweig, Frank Scherbaum, and Meinard Müller
A Dataset of Larynx Microphone Recordings for Singing Voice Reconstruction
Transactions of the International Society for Music Information Retrieval (TISMIR), 7(1): 30–43, 2024.

Dataset components:

Multi-track audio recordings (Vocals close-up, vocals larynx microphone, stereo guitar microphone, guitar pickup)
Two reference mixes (MixA: mix of the guitar left (GL), guitar right (GR), and close-up microphone (CM) signals without effects; MixB: same signals as MixA, with equalization, compression, reverb, and saturation)
Lyrics for each song

Dataset file naming conventions:

The audio folder contains 348 audio files in total.
File naming scheme: SSD[UID]_[Song ID]_[Signal Type]_[Crosstalk]_[Singer ID]_[Take ID].wav

UID: unique numerical identifier for a take between 001 and 072
Song ID: two-letter identifier of the current song
Signal Type: CM: close-up microphone, LM: larynx microphone, GL: guitar left, GR: guitar right, GP: guitar pickup, MixA: stereo mix without effects, MixB: stereo mix with effects (equalization, compression, reverb, saturation)
Crosstalk: C1: guitar crosstalk present on CM signal(s), C0: no guitar crosstalk
Singer ID: identifier for the singer (1M, 2M, 3F, or 4F)
Take ID: take identifier between T1 and T6 (T1-3 use LM-A, T4-6 use LM-B)
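
The naming scheme above can be parsed programmatically. The following is a minimal Python sketch, not part of the dataset itself. The regular expression assumes three-digit UIDs directly after the "SSD" prefix and the exact casing shown in the field list; the function name parse_filename and the song ID "XY" are hypothetical and used only for illustration.

```python
import re

# Pattern derived from the naming scheme above. Digit widths and casing
# are assumptions and have not been checked against the archive contents.
FILENAME_RE = re.compile(
    r"^SSD(?P<uid>\d{3})"
    r"_(?P<song_id>[A-Za-z]{2})"
    r"_(?P<signal_type>CM|LM|GL|GR|GP|MixA|MixB)"
    r"_(?P<crosstalk>C[01])"
    r"_(?P<singer_id>1M|2M|3F|4F)"
    r"_(?P<take_id>T[1-6])"
    r"\.wav$"
)


def parse_filename(name):
    """Split a dataset file name into its fields, or raise ValueError."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a valid dataset file name: {name!r}")
    fields = match.groupdict()
    uid = int(fields["uid"])
    if not 1 <= uid <= 72:
        raise ValueError(f"UID out of range 001-072: {fields['uid']}")
    take_number = int(fields["take_id"][1])
    return {
        "uid": uid,
        "song_id": fields["song_id"],
        "signal_type": fields["signal_type"],
        "has_crosstalk": fields["crosstalk"] == "C1",
        "singer_id": fields["singer_id"],
        "take_id": fields["take_id"],
        # Takes T1-T3 use LM-A, takes T4-T6 use LM-B.
        "larynx_mic": "LM-A" if take_number <= 3 else "LM-B",
    }


# "XY" is a made-up song ID used only for illustration.
info = parse_filename("SSD001_XY_LM_C0_1M_T4.wav")
print(info["signal_type"], info["larynx_mic"])
```

Returning a plain dictionary keeps the sketch dependency-free; the parsed fields can then be used, for example, to pair each LM file with the CM file that shares its UID.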

The lyrics folder contains lyrics as sung for each song.
File naming scheme: [Song ID].txt

Larynx Microphone Types:

LM-A: Albrecht AE-38-S2a
LM-B: self-built using TE Connectivity CM-01B contact microphones

Files

Name	Size
lm-ssd_v1.zip (md5:b3e3ef6646e452a1bb27c928ce1aa11c)	5.9 GB
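
The MD5 checksum listed for the archive can be used to check a download for corruption. The following Python sketch uses only the standard library; the function names md5_of_file and verify_download are hypothetical, and the local path passed to them is an assumption about where the archive was saved.

```python
import hashlib

# Checksum as listed for lm-ssd_v1.zip on this page.
EXPECTED_MD5 = "b3e3ef6646e452a1bb27c928ce1aa11c"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

Assuming the archive was saved to the current directory, verify_download("lm-ssd_v1.zip") should return True for an intact download. Reading in chunks avoids loading the full 5.9 GB file into memory.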

Additional details

DOI: 10.5334/tismir.166

Is described by: Journal article: 10.5334/tismir.166 (DOI)

Deutsche Forschungsgemeinschaft
Computational Analysis of Georgian Vocal Music and Beyond (MU 2686/13-2) 401198673

