Dataset Open Access

Thorsten-Voice - "Thorsten-21.02-neutral" Dataset

Müller, Thorsten; Kreutz, Dominik

Thorsten-Voice (Thorsten-21.02-neutral) is a neutrally spoken voice dataset recorded by Thorsten Müller, with audio optimized by Dominik Kreutz. It is licensed under CC0 so that anybody can use it without financial or licensing hurdles.

"I contribute my personal voice as a person believing in a world where all people are equal. No matter of gender, sexual orientation, religion, skin color and geocoordinates of birth location. A global world where everybody is warmly welcome on any place on this planet and open and free knowledge and education is available to everyone." (Thorsten Müller)

 

Dataset details:

  • LJSpeech file and directory structure
  • 22,668 recorded phrases (WAV files)
  • more than 23 hours of pure audio
  • sample rate: 22,050 Hz
  • mono
  • normalized to -24 dB
  • phrase length (min/avg/max): 2 / 52 / 180 characters
  • no silence at the beginning or end
  • average spoken characters per second: 14
  • sentences with a question mark: 2,780
  • sentences with an exclamation mark: 1,840
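As a rough illustration of working with the LJSpeech layout mentioned in the list, the sketch below parses metadata rows and recomputes the phrase-length statistics. It assumes the usual LJSpeech convention of pipe-separated rows (`file_id|text`, optionally followed by a normalized text column); the file name `metadata.csv` and the helper names are assumptions for illustration, not something this page specifies.

```python
import csv
from statistics import mean


def read_ljspeech_metadata(lines):
    """Parse LJSpeech-style rows: file_id|text[|normalized_text].

    `lines` is any iterable of strings, e.g. an open text file.
    Returns a list of (file_id, text) tuples; the last column is used as text.
    """
    entries = []
    # QUOTE_NONE: quotation marks inside sentences are ordinary characters.
    reader = csv.reader(lines, delimiter="|", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed rows
        entries.append((row[0], row[-1]))
    return entries


def phrase_length_stats(entries):
    """Return (min, rounded average, max) phrase length in characters."""
    lengths = [len(text) for _, text in entries]
    return min(lengths), round(mean(lengths)), max(lengths)


# Hypothetical usage, assuming the archive was extracted to the current directory:
# with open("metadata.csv", encoding="utf-8", newline="") as f:
#     entries = read_ljspeech_metadata(f)
# print(len(entries), phrase_length_stats(entries))
```

The listed figures are also roughly consistent with each other: 22,668 phrases × 52 characters ÷ 14 characters per second ≈ 84,195 seconds, or about 23.4 hours. This is only an estimate, since it is based on rounded averages.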

See more details on my GitHub page or the Thorsten-Voice project website.

Please use it to make the world a better place for all of humankind.

Files (2.7 GB)

Name                      Size
thorsten-neutral_v03.tgz  2.7 GB
md5:a6300c3dd07cde05e8e154f1d1f19de6
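
Because the archive is large, it may be worth checking the download against the md5 value listed for the file. The following is a minimal sketch using Python's standard `hashlib`; the function names and the local file path are assumptions for illustration.

```python
import hashlib

# md5 value as listed for thorsten-neutral_v03.tgz on this page
EXPECTED_MD5 = "a6300c3dd07cde05e8e154f1d1f19de6"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the md5 hex digest of a file, reading it in chunks
    so that a multi-gigabyte archive is never loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's md5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


# Hypothetical usage, assuming the archive sits in the current directory:
# print(verify_archive("thorsten-neutral_v03.tgz"))
```

A mismatch usually points to an incomplete or corrupted download. md5 is suitable for detecting such accidental corruption, but not for guarding against deliberate tampering.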