Published May 29, 2024 | Version v1
Dataset Open

ESC-50-Voice: Dataset of vocal imitation for environmental sound in ESC-50

  • 1. The University of Tokyo
  • 2. Doshisha University
  • 3. Keio University
  • 4. Ritsumeikan University

Description

This dataset contains vocal imitations of environmental sounds in ESC-50 [1]. A vocal imitation replicates, or mimics, the rhythm and pitch of a sound with the voice. The dataset can be used in various tasks that involve environmental sounds. It consists of 9,920 vocal imitations (8 imitators per environmental sound). Every imitator is a Japanese speaker. All audio data are 48 kHz/16-bit WAV files.
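The stated audio format can be checked after download. The following is a minimal sketch using Python's standard `wave` module; the function name `matches_stated_format` and its parameters are illustrative and not part of the dataset.

```python
import wave


def matches_stated_format(path, expected_rate=48000, expected_bits=16):
    """Return True if the WAV file at `path` has the given sample rate and bit depth.

    The defaults mirror the format stated in the description (48 kHz, 16 bit).
    The function name and signature are hypothetical helpers for this sketch.
    """
    with wave.open(str(path), "rb") as wav:
        rate_ok = wav.getframerate() == expected_rate
        # getsampwidth() returns bytes per sample, so multiply by 8 to get bits.
        bits_ok = wav.getsampwidth() * 8 == expected_bits
        return rate_ok and bits_ok
```

Calling `matches_stated_format` on each downloaded file and collecting the paths that return `False` gives a quick sanity check before feature extraction.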

Each audio file is named as follows:

vocal_imitation/SpeakerID/FileName_SpeakerID.wav

FileName is the original audio file name in ESC-50, and SpeakerID is the ID of the imitator. We recorded vocal imitations for a subset of the sound events in ESC-50; the list of sound events used is given in EventList.csv.
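The naming scheme above can be parsed mechanically. The sketch below assumes that FileName excludes the `.wav` extension and that the directory name is exactly the SpeakerID; the speaker ID `spk01` in the example is hypothetical, while `1-100032-A-0` follows the ESC-50 file naming style.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class ImitationFile(NamedTuple):
    speaker_id: str   # ID of the imitator, taken from the directory name
    esc50_name: str   # original ESC-50 file name, without extension


def parse_imitation_path(path: str) -> ImitationFile:
    """Split vocal_imitation/SpeakerID/FileName_SpeakerID.wav into its parts.

    Assumption: the file stem ends with "_" + SpeakerID, as in the pattern above.
    """
    p = PurePosixPath(path)
    speaker_id = p.parent.name
    suffix = "_" + speaker_id
    if not p.stem.endswith(suffix):
        raise ValueError(f"{path!r} does not match FileName_SpeakerID.wav")
    return ImitationFile(speaker_id, p.stem[: -len(suffix)])


# Hypothetical example path; the real speaker IDs may look different.
example = parse_imitation_path("vocal_imitation/spk01/1-100032-A-0_spk01.wav")
```

The recovered `esc50_name` can then be used to pair each imitation with its source clip in a local copy of ESC-50.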

Note that this dataset does not contain the environmental sound files themselves; they can be obtained from ESC-50 [1].

Terms of use

The materials may be used free of charge for research purposes, but please refrain from redistributing them or using them in ways that offend public order and morals. If you want to use them for commercial purposes, please contact us (Yuki Okamoto or Keisuke Imoto).

Citation

If you use this dataset, please cite it as follows:

Yuki Okamoto, Keisuke Imoto, Shinnosuke Takamichi, Ryotaro Nagase, Takahiro Fukumori, and Yoichi Yamashita, "Environmental Sound Synthesis from Vocal Imitations and Sound Event Labels," Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 411-415, 2024.

Feedback

If you find any problems, please contact us.

[1] K. J. Piczak, "ESC: Dataset for Environmental Sound Classification," Proc. 23rd ACM International Conference on Multimedia, pp. 1015-1018, 2015.

Files (1.7 GB)

452 Bytes  md5:3e4651cbc61c6b7733e5094b4d1ab0cb
1.7 GB  md5:8b98ffa211a7e26a44c91ebc89de61ce
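
The md5 checksums listed above can be used to verify a download. This is a minimal sketch with Python's standard `hashlib`; the helper names `md5_of_file` and `verify_md5` are illustrative, and the local path of the downloaded file is left to the caller.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in chunks.

    Chunked reading keeps memory use low for the 1.7 GB archive.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected):
    """Compare a file's md5 with an expected value such as 'md5:3e46...'.

    The optional 'md5:' prefix, as shown in the file list, is stripped first.
    """
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected.lower()
```

A `False` result from `verify_md5` suggests an incomplete or corrupted download, in which case the file should be fetched again.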