EmoWOZ: A Large-Scale Corpus and Labelling Scheme for Emotion Recognition in Task-Oriented Dialogue Systems

doi:10.5281/zenodo.6506504

Published April 29, 2022 | Version 1.0.0

Dataset Open

1. Heinrich Heine University Düsseldorf

This is the dataset created for the paper, "EmoWOZ: A Large-Scale Corpus and Labelling Scheme for Emotion Recognition in Task-Oriented Dialogue Systems" (https://arxiv.org/abs/2109.04919).

EmoWOZ is based on MultiWOZ, a multi-domain task-oriented dialogue dataset (https://github.com/budzianowski/multiwoz). It contains more than 11K task-oriented dialogues with more than 83K emotion annotations of user utterances. In addition to the Wizard-of-Oz dialogues from MultiWOZ, we collect human-machine dialogues within the same set of domains to sufficiently cover the range of emotions that can occur during the lifetime of a data-driven dialogue system. The corpus uses 7 emotion labels, adapted from the OCC emotion models.

For data format and label definition, please refer to README.md.

Notes

S. Feng, N. Lubis, M. Heck, and C. van Niekerk are supported by funding provided by the Alexander von Humboldt Foundation in the framework of the Sofja Kovalevskaja Award endowed by the Federal Ministry of Education and Research, while C. Geishauser and H-C. Lin are supported by funds from the European Research Council (ERC) provided under the Horizon 2020 research and innovation programme (Grant agreement No. STG2018 804636). Computing resources were provided by Google Cloud.

Files

Files (178.0 MB)

Name	MD5	Size
data-split.json	1490939e90c3c7b656e000bc212fd8fe	328.5 kB
emowoz-dialmage.json	9c6b7a91fa93851d4bee6f34947d07f5	18.1 MB
emowoz-multiwoz.json	8b06d935ec69dd21ba654848c9000293	159.6 MB
README.md	eafca80e565f4815f50570946e19f472	3.2 kB
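
The MD5 digests listed above can be used to check that each downloaded file is intact. The following is a minimal sketch, not part of the dataset release: it assumes the four files have been saved under their original names in a single local directory, and the helper names `md5_of_file` and `verify_downloads` are hypothetical.

```python
import hashlib
from pathlib import Path

# MD5 digests copied from the file listing on this record page.
EXPECTED_MD5 = {
    "data-split.json": "1490939e90c3c7b656e000bc212fd8fe",
    "emowoz-dialmage.json": "9c6b7a91fa93851d4bee6f34947d07f5",
    "emowoz-multiwoz.json": "8b06d935ec69dd21ba654848c9000293",
    "README.md": "eafca80e565f4815f50570946e19f472",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so the 159.6 MB file is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each expected file name to True if it exists and its digest matches."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of_file(path) == expected
    return results


if __name__ == "__main__":
    # Assumption: the files were downloaded into the current working directory.
    for name, ok in verify_downloads().items():
        print(f"{name}: {'OK' if ok else 'MISSING OR MISMATCH'}")
```

A file reported as missing or mismatched is best downloaded again before use. The layout of the JSON files themselves is described in README.md and is not assumed here.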

Additional details

Is published in: Dataset: arXiv:2109.04919 (arXiv)

	All versions	This version
Views	1,666	1,001
Downloads	914	828
Data volume	70.9 GB	69.7 GB
