Published September 5, 2023 | Version Initial version
Dataset Open

Interruption Audio & Transcript: Derived from Group Affect and Performance Dataset

  • 1. Imperial College London

Description

Licensing

This dataset is adapted from the Group Affect and Performance dataset, which is released under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license: https://creativecommons.org/licenses/by-nc/4.0/

 

Description

This dataset contains audio clips of manually annotated overlapped utterances, each classified as a True Interruption or a False Interruption. It is derived from the Group Affect and Performance dataset created by the University of the Fraser Valley, Canada. The original conversation transcripts and audio files are supplied for context. The Group Affect and Performance dataset is a rich source of interruptions and overlapped utterances in general: the 14 group meetings that were annotated yielded 355 instances of overlapped utterances, of which 200 are True Interruptions.

 

Structure

This dataset is structured into three parts:

1. data.json contains a list of all instances of overlapped utterances, classified into ‘interruption’ and ‘non-interruption’ corresponding to True and False Interruptions respectively. Each instance is uniquely identified by the Group in which it occurred, the speaker and the starting time of the utterance.

2. The 'audio' directory contains the audio of each instance of overlapped utterances listed in data.json. The files are named as follows: ‘Group [group number]: [utterance start time] - [utterance end time].wav’.

3. Also included is a copy of the original dataset which includes the full audio and transcript. This allows the full meeting to be heard and any context for interruptions to be evaluated.

Note that parts 2 and 3 can be accessed by unzipping audio-and-transcripts.zip.
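The clip naming convention in part 2 can be handled with a small helper. The following is a minimal sketch, not part of the dataset tooling: the names `parse_clip_name`, `build_clip_name` and `ClipName` are hypothetical, and because the description does not state how the start and end times are formatted, the times are kept as raw strings rather than converted to numbers.

```python
import re
from typing import NamedTuple, Optional

# Pattern for 'Group [group number]: [start time] - [end time].wav'.
# The time format is not specified in the description, so both times
# are captured as raw strings (an assumption of this sketch).
CLIP_NAME = re.compile(
    r"^Group (?P<group>\S+): (?P<start>.+?) - (?P<end>.+)\.wav$"
)


class ClipName(NamedTuple):
    group: str
    start: str
    end: str


def parse_clip_name(filename: str) -> Optional[ClipName]:
    """Return the parts of a clip file name, or None if it does not match."""
    match = CLIP_NAME.match(filename)
    if match is None:
        return None
    return ClipName(match["group"], match["start"], match["end"])


def build_clip_name(group: str, start: str, end: str) -> str:
    """Inverse of parse_clip_name: format the parts back into a file name."""
    return f"Group {group}: {start} - {end}.wav"
```

For an illustrative (made-up) name such as `Group 3: 12.5 - 14.25.wav`, `parse_clip_name` returns the group and the two time strings, which can then be matched against the corresponding entry in data.json.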

 

Data Collection Protocol

Of paramount importance to our process are the definitions of an overlapped utterance and a True Interruption. A False Interruption is simply an overlapped utterance that is not a True Interruption. These definitions directly shape the dataset: the definition of an overlapped utterance determines which data points are included, and the definition of a True Interruption determines the class assigned to each sample.

In defining an overlapped utterance, our primary aim is to create an overarching class encompassing interruptions and all instances that could be deemed a True Interruption.

An overlapped utterance is defined as an instance where one interlocutor produces speech or noise during another interlocutor’s speech, creating an overlap that may be deemed a possible interruption when considering its timing alone. For this reason, we omit cases where the timing indicates misplaced speech or early-onset responses.

Our definition of True Interruption is an instance where an interrupting party intentionally attempts to take over a turn of the conversation from an interruptee and, in doing so, creates an overlap in speech.

As previously mentioned, because of the ‘intent’ part of this definition, we exclude cases of misplaced speech and early-onset responses. The former is enforced by discarding overlapped utterances that begin within 300 ms of each other, since this is an estimate of the average human reaction time for articulating a vowel in response to a speech stimulus. The latter is enforced by discarding speech that starts within the last 10% of the first utterance in the overlapped speech. Because this approach does not filter out all cases of misplaced speech, we manually remove the remaining instances.
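The two timing rules can be expressed as a single predicate. This is a minimal sketch under stated assumptions, not the authors' code: the function name `is_candidate_overlap` is hypothetical, times are assumed to be in seconds, the 10% window is measured against the duration of the first utterance, and the choice of strict versus inclusive comparisons at the boundaries is a guess because the protocol does not specify it.

```python
REACTION_TIME_S = 0.300      # minimum onset gap, from the 300 ms rule
LATE_ONSET_FRACTION = 0.10   # final share of the first utterance that is excluded


def is_candidate_overlap(first_start, first_end, second_start):
    """Return True if the second utterance passes both timing filters.

    All times are in seconds. Boundary handling (strict vs. inclusive
    comparisons) is an assumption; the protocol text does not specify it.
    """
    duration = first_end - first_start
    if duration <= 0:
        return False
    # The second utterance must begin while the first is still ongoing.
    if not (first_start <= second_start < first_end):
        return False
    # Misplaced speech: onsets within 300 ms of each other are dropped.
    if second_start - first_start < REACTION_TIME_S:
        return False
    # Early-onset response: starts inside the last 10% of the first utterance.
    if second_start >= first_end - LATE_ONSET_FRACTION * duration:
        return False
    return True
```

A pair that passes this check is only a candidate. As noted above, the timing filter does not catch every case of misplaced speech, so manual review is still required before a sample is labelled.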

 

Methodology

Three main steps were taken to produce this dataset:

1. Parsing the transcripts for cases of overlapping speech

2. Manually annotating these cases per our protocol and adding them to data.json

3. Extracting audio samples from data.json and adding them to the audio folder
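Step 3 amounts to cutting a time range out of a meeting recording. The sketch below shows one way to do this with Python's standard `wave` module. It is an illustration only, not the tooling used to build the dataset: the function name `extract_clip` is hypothetical, and it assumes the source recording is an uncompressed WAV file and that the start and end times are given in seconds.

```python
import wave


def extract_clip(source_path, dest_path, start_s, end_s):
    """Copy the audio between start_s and end_s (seconds) into a new WAV file.

    Returns the number of frames written.
    """
    with wave.open(source_path, "rb") as src:
        rate = src.getframerate()
        total = src.getnframes()
        # Convert seconds to frame indices and clamp them to the file length.
        first = max(0, min(total, int(round(start_s * rate))))
        last = max(first, min(total, int(round(end_s * rate))))
        src.setpos(first)
        frames = src.readframes(last - first)
        params = src.getparams()

    with wave.open(dest_path, "wb") as dst:
        # Reuse the source format (channels, sample width, frame rate).
        dst.setparams(params)
        # The frame count in the header is corrected when the file is closed.
        dst.writeframes(frames)
    return last - first
```

Clamping the frame indices means a range that runs past the end of the recording yields a shorter clip rather than an error, which may or may not be the desired behaviour for a given pipeline.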

 

If you use this dataset, please cite the following paper:

 

Doyle, D.; Şerban, O. Interruption Audio & Transcript: Derived from Group Affect and Performance Dataset. Data 2024, 9, 104. https://doi.org/10.3390/data9090104

 

@article{data9090104,
AUTHOR = {Doyle, Daniel and Şerban, Ovidiu},
TITLE = {Interruption Audio & Transcript: Derived from Group Affect and Performance Dataset},
JOURNAL = {Data},
VOLUME = {9},
YEAR = {2024},
NUMBER = {9},
ARTICLE-NUMBER = {104},
URL = {https://www.mdpi.com/2306-5729/9/9/104},
ISSN = {2306-5729},
DOI = {10.3390/data9090104}
}

Files

audio-and-transcripts.zip

Files (916.7 MB)

916.7 MB  md5:ac4ee3b583462f812d867b8db9a23abe
51.7 kB  md5:e8e03717a4e1e277e111cf0840370441
17.5 kB  md5:cd7bd69e3e02d0d3a0c35a786a178967

Additional details

References

  • Braley M, Murray G. The group affect and performance (gap) corpus. In: Proceedings of the Group Interaction Frontiers in Technology; 2018. p. 1-9.