Published August 20, 2023 | Version v1
Dataset Open

CAPTDURE: Captioned Sound Dataset of Single Sources

  • 1. Ritsumeikan University
  • 2. Doshisha University
  • 3. Hitachi, Ltd.

Description

This dataset provides captions for single-source sounds and can be used in a variety of tasks involving environmental sounds. It contains 1,044 single-source sounds with 4,902 captions (three or more captions per sound) and 1,044 multiple-source sounds with 3,132 captions (three captions per sound). Details of the dataset are described in [1].

Conditions of use

This dataset was made by Hitachi, Ltd. and is available under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.

Citation

If you use this dataset, please cite it as follows:

[1] Yuki Okamoto, Kanta Shimonishi, Keisuke Imoto, Kota Dohi, Shota Horiguchi, and Yohei Kawaguchi, "CAPTDURE: Captioned Sound Dataset of Single Sources," Proc. INTERSPEECH, pp. 1683-1687, 2023.

Feedback

If you encounter any problems, please contact us.

Files

mixture_source_caption.csv

Files (1.8 GB)

MD5 checksum | Size
md5:73bb19c9ef5fa65b52b0566cc0399d69 | 754.7 kB
md5:8826b524acd3806a05c9d6db0b612fa9 | 1.4 MB
md5:7d7e56e820c78b899be1da2af0f88bf8 | 1.2 GB
md5:ca9245615aa06445642e914ea79ebe2d | 808.1 kB
md5:5804951ce3869cd0feb1b5d23a6b9ae6 | 2.3 MB
md5:4f8e924525975149c4ea91a11722d2e0 | 585.1 MB
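
The MD5 values in the file listing can be used to check that a download completed without corruption. The following is a minimal sketch using only the Python standard library. The helper names `md5_of_file` and `matches_listing` are hypothetical and are not part of the dataset, and the file path in the usage note is a placeholder, since the listing does not pair each checksum with a file name.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that large archives do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a local file against a listed value such as 'md5:73bb...'.

    The optional 'md5:' prefix is stripped before comparison.
    """
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_listing("path/to/downloaded_file", "md5:73bb19c9ef5fa65b52b0566cc0399d69")` returns `True` only if the local file (placeholder path) has exactly that checksum. Reading in chunks keeps memory use small even for the 1.2 GB file.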