Published January 23, 2025 | Version v1 | Dataset | Open
fluentspeechcommands in WebDataset Format
Description
This is the fluentspeechcommands dataset, repackaged in the WebDataset format. A WebDataset shard is a plain tar archive in which each example is stored as a pair of files that share the same base name: a WAV audio file and a JSON metadata file. The JSON file holds the labels for that audio sample (action, object, and location) together with the speaker ID and the transcription.
$ tar tvf fluentspeechcommands_train_0000000.tar | head
-r--r--r-- bigdata/bigdata 174 2025-01-17 07:20 48fac300-45c8-11e9-8ec0-7bf21d1cfe30.json
-r--r--r-- bigdata/bigdata 131116 2025-01-17 07:20 48fac300-45c8-11e9-8ec0-7bf21d1cfe30.wav
-r--r--r-- bigdata/bigdata 136 2025-01-17 07:20 3f770360-44e3-11e9-bb82-bdba769643e7.json
-r--r--r-- bigdata/bigdata 71376 2025-01-17 07:20 3f770360-44e3-11e9-bb82-bdba769643e7.wav
-r--r--r-- bigdata/bigdata 132 2025-01-17 07:20 3ea38ea0-4613-11e9-bc65-55b32b211b66.json
-r--r--r-- bigdata/bigdata 68310 2025-01-17 07:20 3ea38ea0-4613-11e9-bc65-55b32b211b66.wav
-r--r--r-- bigdata/bigdata 143 2025-01-17 07:20 61578420-45ea-11e9-b578-494a5b19ab8b.json
-r--r--r-- bigdata/bigdata 89208 2025-01-17 07:20 61578420-45ea-11e9-b578-494a5b19ab8b.wav
-r--r--r-- bigdata/bigdata 132 2025-01-17 07:20 c4595690-4520-11e9-a843-8db76f4b5e29.json
-r--r--r-- bigdata/bigdata 76502 2025-01-17 07:20 c4595690-4520-11e9-a843-8db76f4b5e29.wav
$ cat 48fac300-45c8-11e9-8ec0-7bf21d1cfe30.json
{"speakerId": "52XVOeXMXYuaElyw", "transcription": "I need to practice my English. Switch the language", "action": "change language", "object": "English", "location": "none"}
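Because each shard is an ordinary tar archive, it can be read with the Python standard library alone. The following is a minimal sketch, not an official loader: the helper names `iter_samples` and `wav_duration_seconds` are hypothetical, and it assumes that the two files of an example are stored next to each other in the archive (as in the listing above) and that the audio is uncompressed PCM WAV.

```python
import io
import json
import tarfile
import wave


def iter_samples(tar_path):
    """Yield (key, metadata, wav_bytes) for each JSON/WAV pair in one shard.

    Assumes the files of one example are adjacent in the archive and
    share the same base name, e.g. <key>.json followed by <key>.wav.
    """
    current_key, sample = None, {}
    with tarfile.open(tar_path, "r") as tar:
        for member in tar:
            if not member.isfile():
                continue
            # Split "<key>.<ext>" at the last dot.
            key, _, ext = member.name.rpartition(".")
            if key != current_key:
                # A new base name starts: emit the previous example if complete.
                if {"json", "wav"} <= sample.keys():
                    yield current_key, json.loads(sample["json"]), sample["wav"]
                current_key, sample = key, {}
            sample[ext] = tar.extractfile(member).read()
    # Emit the last example in the archive.
    if {"json", "wav"} <= sample.keys():
        yield current_key, json.loads(sample["json"]), sample["wav"]


def wav_duration_seconds(wav_bytes):
    """Return the duration of an in-memory PCM WAV file in seconds."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        return w.getnframes() / w.getframerate()


# Example usage (requires the shard to be present locally):
# for key, meta, wav_bytes in iter_samples("fluentspeechcommands_train_0000000.tar"):
#     print(key, meta["action"], meta["object"], meta["location"],
#           wav_duration_seconds(wav_bytes))
```

The `action`, `object`, and `location` fields used in the commented example are the ones shown in the JSON sample above. For training pipelines, a dedicated WebDataset reader may be more convenient than this hand-rolled loop.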