Published January 23, 2025 | Version v1
Dataset Open

fluentspeechcommands in WebDataset Format

Description

This is the Fluent Speech Commands dataset (fluentspeechcommands), repackaged in the WebDataset format. Each WebDataset shard is a plain tar archive in which every example is stored as a pair of files sharing a basename: a WAV audio file and a JSON metadata file. The JSON file holds the labels for that audio sample (action, object, and location), along with the speaker ID and the transcription.

$ tar tvf fluentspeechcommands_train_0000000.tar |head
-r--r--r-- bigdata/bigdata 174 2025-01-17 07:20 48fac300-45c8-11e9-8ec0-7bf21d1cfe30.json
-r--r--r-- bigdata/bigdata 131116 2025-01-17 07:20 48fac300-45c8-11e9-8ec0-7bf21d1cfe30.wav
-r--r--r-- bigdata/bigdata    136 2025-01-17 07:20 3f770360-44e3-11e9-bb82-bdba769643e7.json
-r--r--r-- bigdata/bigdata  71376 2025-01-17 07:20 3f770360-44e3-11e9-bb82-bdba769643e7.wav
-r--r--r-- bigdata/bigdata    132 2025-01-17 07:20 3ea38ea0-4613-11e9-bc65-55b32b211b66.json
-r--r--r-- bigdata/bigdata  68310 2025-01-17 07:20 3ea38ea0-4613-11e9-bc65-55b32b211b66.wav
-r--r--r-- bigdata/bigdata    143 2025-01-17 07:20 61578420-45ea-11e9-b578-494a5b19ab8b.json
-r--r--r-- bigdata/bigdata  89208 2025-01-17 07:20 61578420-45ea-11e9-b578-494a5b19ab8b.wav
-r--r--r-- bigdata/bigdata    132 2025-01-17 07:20 c4595690-4520-11e9-a843-8db76f4b5e29.json
-r--r--r-- bigdata/bigdata  76502 2025-01-17 07:20 c4595690-4520-11e9-a843-8db76f4b5e29.wav

$ cat 48fac300-45c8-11e9-8ec0-7bf21d1cfe30.json 
{"speakerId": "52XVOeXMXYuaElyw", "transcription": "I need to practice my English. Switch the language", "action": "change language", "object": "English", "location": "none"}
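
Because a shard is an ordinary tar archive, it can be read with the Python standard library alone, without the webdataset package. The following is a minimal sketch, not an official loader. The function names iter_samples and wav_duration_seconds are hypothetical. The sketch assumes that the two files of a sample are stored next to each other in the archive, as in the listing above, and that the audio is uncompressed PCM WAV, which is what the standard wave module can parse.

```python
import io
import json
import tarfile
import wave


def iter_samples(tar_path):
    """Yield (key, metadata, wav_bytes) for each JSON/WAV pair in a shard.

    Assumption: both files of a sample share a basename and are adjacent
    in the archive, as in the tar listing above.
    """
    with tarfile.open(tar_path, "r") as tar:
        key, sample = None, {}
        for member in tar:
            if not member.isfile():
                continue
            # "abc.json" -> stem "abc", ext "json"
            stem, _, ext = member.name.rpartition(".")
            if stem != key:
                # A new basename starts: emit the previous sample if complete.
                if "json" in sample and "wav" in sample:
                    yield key, json.loads(sample["json"]), sample["wav"]
                key, sample = stem, {}
            sample[ext] = tar.extractfile(member).read()
        # Emit the last sample in the archive.
        if "json" in sample and "wav" in sample:
            yield key, json.loads(sample["json"]), sample["wav"]


def wav_duration_seconds(wav_bytes):
    """Return the duration of a PCM WAV payload in seconds."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        return w.getnframes() / w.getframerate()


# Example use (shard name taken from the listing above):
# for key, meta, wav in iter_samples("fluentspeechcommands_train_0000000.tar"):
#     print(key, meta["action"], meta["object"], meta["location"],
#           wav_duration_seconds(wav))
```

Reading the archive sequentially keeps memory use to one sample at a time, which is the access pattern the WebDataset layout is designed for.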

Files (2.3 GB)

MD5 checksum                          Size
md5:8b8501c06a38e9ca1598011b665e43d5  311.7 MB
md5:ba7231b49f2da899d3c8b20aac737067  746.6 MB
md5:5337a90fc956b24ffb7ff346ea0c7937  780.9 MB
md5:77fb922d10fa0adbee29a2c52baa4da7  258.2 MB
md5:7f50b2c8933158811e95ad09781607bc  236.8 MB
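
The MD5 checksums listed above can be used to confirm that a downloaded shard is intact. The sketch below is one way to do that with the Python standard library. The function names md5_of_file and matches_checksum are hypothetical, and since the listing does not show which checksum belongs to which file name, the caller supplies both the path and the expected value.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


# Example use (the pairing of path and checksum is an assumption):
# ok = matches_checksum("fluentspeechcommands_train_0000000.tar",
#                       "md5:8b8501c06a38e9ca1598011b665e43d5")
```

Reading in chunks avoids loading a shard of several hundred megabytes into memory at once.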