Published February 14, 2025 | Version v2

FSD50K in WebDataset Format

Authors/Creators

Description

This is the FSD50K dataset repackaged in the WebDataset format. WebDataset shards are tar archives in which each example is stored as a pair of files sharing the same basename: a WAV audio file and a JSON metadata file. The JSON file contains the class labels for that audio sample (in the example below, a semicolon-separated string under the "soundevent" key) and other relevant information.

$ tar tvf fsdk50_eval_0000000.tar |head
-r--r--r-- bigdata/bigdata  40 2025-01-12 13:02 45604.json
-r--r--r-- bigdata/bigdata 43066 2025-01-12 13:02 45604.wav
-r--r--r-- bigdata/bigdata    46 2025-01-12 13:02 213293.json
-r--r--r-- bigdata/bigdata 1372242 2025-01-12 13:02 213293.wav
-r--r--r-- bigdata/bigdata      82 2025-01-12 13:02 348174.json
-r--r--r-- bigdata/bigdata  804280 2025-01-12 13:02 348174.wav
-r--r--r-- bigdata/bigdata      71 2025-01-12 13:02 417736.json
-r--r--r-- bigdata/bigdata 2238542 2025-01-12 13:02 417736.wav
-r--r--r-- bigdata/bigdata      43 2025-01-12 13:02 235555.json
-r--r--r-- bigdata/bigdata  542508 2025-01-12 13:02 235555.wav
$ tar -xOf fsdk50_eval_0000000.tar 45604.json
{"soundevent": "Yell;Shout;Human_voice"}
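
The listing above suggests that a shard can be read with nothing more than a tar reader. The following is a minimal sketch using only the Python standard library, not the official WebDataset loader. It assumes that the two files of a pair are adjacent in the archive (as in the listing), that labels are stored as a semicolon-separated string under "soundevent" (as in the one JSON file shown), and that the audio is PCM WAV readable by the standard wave module. The function names iter_examples and wav_info are illustrative, not part of the dataset.

```python
import io
import json
import tarfile
import wave


def _finish(key, sample):
    """Turn the raw bytes of one JSON/WAV pair into (key, wav_bytes, labels)."""
    meta = json.loads(sample["json"])
    # Assumption: labels are a semicolon-separated string under "soundevent".
    labels = [name for name in meta.get("soundevent", "").split(";") if name]
    return key, sample["wav"], labels


def iter_examples(tar_path):
    """Yield (key, wav_bytes, labels) for each complete JSON/WAV pair in a shard."""
    current_key, sample = None, {}
    with tarfile.open(tar_path, "r") as tar:
        for member in tar:
            if not member.isfile():
                continue
            # "45604.json" -> key "45604", extension "json"
            key, _, ext = member.name.rpartition(".")
            if key != current_key:
                # A new basename starts: emit the previous pair if it is complete.
                if "json" in sample and "wav" in sample:
                    yield _finish(current_key, sample)
                current_key, sample = key, {}
            sample[ext] = tar.extractfile(member).read()
    # Emit the last pair in the archive.
    if "json" in sample and "wav" in sample:
        yield _finish(current_key, sample)


def wav_info(wav_bytes):
    """Return (sample_rate, n_frames) for PCM WAV bytes."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getframerate(), wav.getnframes()


# Example use (path taken from the listing above):
# for key, wav_bytes, labels in iter_examples("fsdk50_eval_0000000.tar"):
#     print(key, labels, wav_info(wav_bytes))
```

Reading the archive sequentially keeps memory use bounded by the largest single example, which is the main reason the tar-based layout suits streaming.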

Files (34.7 GB)

Checksum                              Size
md5:6e4d0f79d1bc09f2fbe5a524ff0d30a0  8.7 GB
md5:2792c69397435773684b43a9610237d1  224.4 MB
md5:3cd86e95945deaeca867ca4d41dad73e  5.1 GB
md5:227fb26e74c690e6e00c5f6550713149  6.4 GB
md5:7fd9f80925148a069d25acda97a8ba9a  5.8 GB
md5:5c58aad5ef582a29db9ece05c1a6917a  5.3 GB
md5:e1d6ff1c42b91131bc81cf23cd9ab304  3.3 GB
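
The md5 values listed above can be used to check a download for corruption. The sketch below hashes a file in chunks so that multi-gigabyte archives do not need to fit in memory. The helper names md5_of_file and matches_listing are illustrative. Because the listing here does not show which file name each checksum belongs to, the path in the usage comment is a placeholder.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a listed value such as 'md5:6e4d0f79...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Example use (placeholder path; pair each file with its own checksum):
# ok = matches_listing("path/to/downloaded_file", "md5:6e4d0f79d1bc09f2fbe5a524ff0d30a0")
```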

Additional details

References

  • Fonseca, E., Favory, X., Pons, J., Font, F., & Serra, X. (2021). FSD50K: An open dataset of human-labeled sound events. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 30, 829-852.