Published March 20, 2024 | Version v1.0
Dataset Open

German own voice recordings with hearable microphones

  • 1. Fraunhofer Institute for Digital Media Technology
  • 2. Carl von Ossietzky University of Oldenburg

Description

This dataset is supplementary material to the article "Modeling of Speech-dependent Own Voice Transfer Characteristics for Hearables with an In-ear Microphone" published in Acta Acustica, vol. 8 (2024).

The dataset consists of recordings of own voice speech of 18 talkers (5 female, 13 male) wearing hearable devices in both ears. All talkers were native German speakers. The dataset was recorded in a sound-proof listening booth using the Hearpiece prototype device (closed vent variant) [1].

The speech uttered by the talkers was pre-determined text read from a screen. The talkers pressed a button on the screen to start the recording, read the sentence aloud in a normal voice, and pressed another button to stop the recording. Talkers could re-record a sentence if desired. The sentences originate from the following sources:

  • The north wind and the sun (German), 6 sentences
  • Berlin and Marburg Sentences [2] (German), 2x100 sentences
  • 100 sentences for language learners [3] (German), 100 sentences
  • Some held-out vowels and consonants
  • [some seconds of silence]

The full text read by each talker is written in full_text.txt.

The recordings are contained in the folder speech. Each subfolder contains the recordings of one talker (e.g., VP_01). The sentences uttered by each talker are numbered following this scheme: VP_01_0.wav to VP_01_313.wav. Talkers for whom the device could not be inserted, or whose fit did not sufficiently attenuate external sounds at the in-ear microphone, were excluded.
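The naming scheme above can be turned into a list of expected file paths. The following is a minimal sketch, not part of the dataset: the function name talker_files is hypothetical, the two-digit zero-padded talker ID is inferred from the VP_01 example, and the index range 0 to 313 is assumed to apply to every talker. The sketch only builds path names and does not check that the files exist.

```python
from pathlib import Path


def talker_files(root, talker, first=0, last=313):
    """Build the expected recording paths for one talker.

    Assumes the layout <root>/VP_XX/VP_XX_<index>.wav described above,
    with XX being the zero-padded talker number (an inference from VP_01).
    """
    name = f"VP_{talker:02d}"
    return [Path(root) / name / f"{name}_{i}.wav" for i in range(first, last + 1)]


# Example: all expected paths for talker 1 inside the folder "speech".
paths = talker_files("speech", 1)
```

Filtering the returned list with Path.exists() is one way to skip talkers or sentences that are not present in a local copy.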

Two reference signals were recorded with a DPA 6060 lavalier clip microphone and a Tbone SC140 cardioid microphone. In addition, the concha and in-ear microphones of both Hearpiece devices (closed vent) were recorded. All audio was recorded at a sampling frequency of 44100 Hz.

The channels of the recordings, counting from 0, recorded the following microphones:

  • 0: Lavalier microphone clipped to the talker's clothing near the neck (e.g., shirt neck or collar)
  • 1: Reference microphone about 50 cm in front of the talker
  • 2: Left in-ear microphone Hearpiece
  • 3: Left concha microphone Hearpiece
  • 4: Right in-ear microphone Hearpiece
  • 5: Right concha microphone Hearpiece
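The channel assignment above can be written down as a small lookup table. The sketch below is an illustration under stated assumptions: the short labels (lavalier, left_in_ear, and so on) and the helper names are hypothetical and not defined by the dataset, and the samples are assumed to be already decoded into a flat, frame-interleaved sequence, which is how WAV files store multichannel audio.

```python
# Channel order as listed above, counting from 0. Labels are illustrative.
CHANNELS = {
    0: "lavalier",
    1: "reference",
    2: "left_in_ear",
    3: "left_concha",
    4: "right_in_ear",
    5: "right_concha",
}
SAMPLE_RATE_HZ = 44100  # sampling frequency stated in the description


def channel_index(label):
    """Return the channel number for one of the labels in CHANNELS."""
    for index, name in CHANNELS.items():
        if name == label:
            return index
    raise KeyError(f"unknown channel label: {label}")


def extract_channel(interleaved, label):
    """Pick one microphone from a flat, frame-interleaved sample sequence."""
    n_channels = len(CHANNELS)
    if len(interleaved) % n_channels != 0:
        raise ValueError("sample count is not a multiple of the channel count")
    return list(interleaved[channel_index(label)::n_channels])


def duration_seconds(n_frames):
    """Convert a number of frames to seconds at the dataset's sampling rate."""
    return n_frames / SAMPLE_RATE_HZ
```

When an audio library returns the data as a two-dimensional array of shape (frames, channels), the same index from channel_index can be used to select the column directly.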

[1] F. Denk, M. Lettau, H. Schepker, S. Doclo, R. Roden, M. Blau, J.-H. Bach, J. Wellmann, and B. Kollmeier: "A One-Size-Fits-All Earpiece with Multiple Microphones and Drivers for Hearing Device Research". In: Proc. AES International Conference on Headphone Technology. San Francisco, USA, Aug. 2019.

[2] A. P. Simpson, K. J. Kohler, and T. Rettstadt. "The Kiel Corpus of Read/Spontaneous Speech: Acoustic Data Base, Processing Tools, and Analysis Results". In: Arbeitsberichte Institut für Phonetik Und Digitale Sprachverarbeitung Universität Kiel. Vol. 32. IPDS, Nov. 1997, pp. 243-247.

[3] A. Neustein. "100 Sätze Reichen Für Ein Ganzes Leben" (Blog-post). https://deutschlernerblog.de/100-saetze-reichen-fuer-ein-ganzes-leben/. Aug. 2019.

Notes

The Oldenburg Branch for Hearing, Speech and Audio Technology HSA is funded in the program »Vorab« by the Lower Saxony Ministry of Science and Culture (MWK) and the Volkswagen Foundation for its further development. This work was partly funded by the German Federal Ministry of Education and Research (BMBF), FK 16SV8811, and by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) - Project ID 352015383 - SFB 1330 C1.

Files

full_text.txt

Files (4.0 GB)

Checksum Size
md5:d2b373e387746fec734012c006439ed4 11.6 kB
md5:7c46b4d37319142ce52c1aa02f0136fe 4.0 GB
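The MD5 checksums listed above can be used to verify a download. The following is a minimal sketch using Python's standard hashlib module; the function names are hypothetical, and the local file path is a placeholder because the file names are not shown in the listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a local file against a checksum written as 'md5:<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, matches_listing("path/to/downloaded_file", "md5:7c46b4d37319142ce52c1aa02f0136fe") should return True for an intact copy of the 4.0 GB file, with the path replaced by the actual download location. Chunked reading keeps memory use low for a file of that size.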

Additional details

Related works

Is supplement to
Conference proceeding: 10.1109/ICASSP48485.2024.10447066 (DOI)
Preprint: 10.48550/arXiv.2310.06554 (DOI)
Journal article: 10.1051/aacus/2024032 (DOI)