Published February 14, 2025 | Version v2
Dataset Open

FSD50K in WebDataset Format

Creators

Description

This is the FSD50K dataset repackaged in the WebDataset format. WebDataset shards are plain tar archives in which each example is stored as a pair of files that share a basename: a WAV audio file and a JSON metadata file. The JSON file holds the class labels for that audio sample, stored as a semicolon-separated string under the "soundevent" key, along with any other relevant information.

$ tar tvf fsdk50_eval_0000000.tar | head
-r--r--r-- bigdata/bigdata      40 2025-01-12 13:02 45604.json
-r--r--r-- bigdata/bigdata   43066 2025-01-12 13:02 45604.wav
-r--r--r-- bigdata/bigdata      46 2025-01-12 13:02 213293.json
-r--r--r-- bigdata/bigdata 1372242 2025-01-12 13:02 213293.wav
-r--r--r-- bigdata/bigdata      82 2025-01-12 13:02 348174.json
-r--r--r-- bigdata/bigdata  804280 2025-01-12 13:02 348174.wav
-r--r--r-- bigdata/bigdata      71 2025-01-12 13:02 417736.json
-r--r--r-- bigdata/bigdata 2238542 2025-01-12 13:02 417736.wav
-r--r--r-- bigdata/bigdata      43 2025-01-12 13:02 235555.json
-r--r--r-- bigdata/bigdata  542508 2025-01-12 13:02 235555.wav
$ tar -xOf fsdk50_eval_0000000.tar 45604.json
{"soundevent": "Yell;Shout;Human_voice"}
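
The pairing shown in the listing above can also be read without any WebDataset tooling. The following is a minimal sketch using only the Python standard library; the helper names parse_labels and iter_samples are hypothetical, and it assumes that every sample has exactly one .json and one .wav member with the same basename and that the JSON carries a "soundevent" key, as in the example. Any other metadata fields are ignored, and the WAV data is returned as raw bytes rather than decoded.

```python
import json
import os
import tarfile


def parse_labels(json_bytes):
    """Split the semicolon-separated "soundevent" field into a list of labels."""
    meta = json.loads(json_bytes.decode("utf-8"))
    return meta["soundevent"].split(";")


def iter_samples(tar):
    """Yield (key, wav_bytes, labels) from an open tarfile.TarFile.

    Members are grouped by their name without the extension. A sample is
    emitted as soon as both its .json and .wav members have been read.
    """
    pending = {}
    for member in tar:
        if not member.isfile():
            continue
        key, ext = os.path.splitext(member.name)
        parts = pending.setdefault(key, {})
        parts[ext.lower()] = tar.extractfile(member).read()
        if ".json" in parts and ".wav" in parts:
            del pending[key]
            yield key, parts[".wav"], parse_labels(parts[".json"])
```

To use it, open a shard with tarfile.open("fsdk50_eval_0000000.tar") and loop over iter_samples(tar). For the first sample in the listing, this should give the key "45604" and the labels Yell, Shout and Human_voice.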

Files

Total size: 34.7 GB

Checksum                              Size
md5:6e4d0f79d1bc09f2fbe5a524ff0d30a0  8.7 GB
md5:2792c69397435773684b43a9610237d1  224.4 MB
md5:3cd86e95945deaeca867ca4d41dad73e  5.1 GB
md5:227fb26e74c690e6e00c5f6550713149  6.4 GB
md5:7fd9f80925148a069d25acda97a8ba9a  5.8 GB
md5:5c58aad5ef582a29db9ece05c1a6917a  5.3 GB
md5:e1d6ff1c42b91131bc81cf23cd9ab304  3.3 GB
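
The MD5 values above can be used to check a download for corruption. The following is a minimal sketch using Python's hashlib; the function names md5_of_file and matches_checksum are hypothetical. Which checksum belongs to which archive is not recoverable from this listing, so the expected value has to be taken from the record page for the file in question.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a listed value, with or without the 'md5:' prefix."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

Reading in chunks matters here because the archives are several gigabytes each and would not comfortably fit in memory in one read.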

Additional details

References

  • Fonseca, E., Favory, X., Pons, J., Font, F., & Serra, X. (2021). FSD50K: An open dataset of human-labeled sound events. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 30, 829-852.