OfficeDial Dataset
Authors/Creators
- 1. University of Illinois at Chicago
- 2. Arizona State University
Description
# OfficeDial Dataset
## EXPLANATION OF DATA FILES
We are releasing this dataset as a JSON file containing dialogues between a user and an IVA at different noise levels and in different scenarios. The format of the dataset is adapted from the [Taskmaster](https://github.com/google-research-datasets/Taskmaster) dataset.
The dataset is a dictionary of filenames and an array of conversations.
Each conversation contains the following attributes:
- conversation_id: a unique id
- scenario: the scenario of this conversation; one of S1_A, S1_B, S2_A, S2_B, S3_A, S3_B
- noise: the noise level played during this conversation; one of SILENCE, NON_VERBAL, VERBAL
- utterances: an array of utterances
Each utterance contains the following fields:
- index: the position of this utterance within the conversation, starting at 0
- speaker: the speaker of this utterance; one of USER, ASSISTANT
- text: the transcription of the spoken words
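The structure described above can be traversed with a few lines of Python. This is a minimal sketch, not part of the dataset release: it assumes the top-level dictionary maps each filename to an array of conversations, and the helper names (`load_dataset`, `iter_conversations`, `ordered_turns`, `count_by_noise`) and the `sample` record are hypothetical placeholders rather than real data.

```python
import json
from collections import Counter


def load_dataset(path="officedial_dataset.json"):
    """Read the JSON file into a Python dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def iter_conversations(dataset):
    """Yield (filename, conversation) pairs.

    Assumption: the top-level dictionary maps filenames to
    arrays of conversations.
    """
    for filename, conversations in dataset.items():
        for conversation in conversations:
            yield filename, conversation


def ordered_turns(conversation):
    """Return (speaker, text) pairs sorted by the utterance index."""
    utterances = sorted(conversation["utterances"], key=lambda u: u["index"])
    return [(u["speaker"], u["text"]) for u in utterances]


def count_by_noise(dataset):
    """Count conversations per noise level (SILENCE, NON_VERBAL, VERBAL)."""
    return Counter(conv["noise"] for _, conv in iter_conversations(dataset))


# Hypothetical record that only mirrors the documented fields;
# the texts are placeholders, not dataset content.
sample = {
    "example_file": [
        {
            "conversation_id": "example-0",
            "scenario": "S1_A",
            "noise": "SILENCE",
            "utterances": [
                {"index": 1, "speaker": "ASSISTANT", "text": "placeholder reply"},
                {"index": 0, "speaker": "USER", "text": "placeholder request"},
            ],
        }
    ]
}
```

With the real file in the working directory, `count_by_noise(load_dataset())` would give the number of conversations per noise level, provided the top-level layout matches the assumption stated above.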
## License
Creative Commons Attribution License (cc-by).
Files
officedial_dataset.json
Total size: 238.4 kB
| Checksum | Size |
|---|---|
| md5:7836824e640554f5806362e3a17a2e87 | 237.4 kB |
| md5:a55e0c2081a27e08ba1d00a3ee2746fd | 959 Bytes |
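A downloaded file can be checked against the MD5 checksums listed above. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_checksum` are hypothetical, and because the listing does not name the file each checksum belongs to, pairing a checksum with `officedial_dataset.json` is an assumption.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 digest with a value such as 'md5:<hex digest>'."""
    expected_hex = expected.split(":")[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

For example, `matches_checksum("officedial_dataset.json", "md5:7836824e640554f5806362e3a17a2e87")` would return `True` for an intact download if that checksum is indeed the one for the JSON file.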