Published December 5, 2023 | Version 1.1
Journal article | Open Access

The Weights Task Dataset: A Multimodal Dataset of Collaboration in a Situated Task

Description

The Weights Task Dataset (WTD) is a novel dataset of a situated, shared collaborative task, annotated to encode many cross-cutting aspects of the participants' situated and embodied involvement in joint activity. The WTD consists of 10 videos (stored as .mp4 files) totaling approximately 170 minutes. By convention, participants are identified numerically from left (P1) to right (P3).

The Weights Task is currently completed in groups of 3 at a round table in a laboratory setting. The setup includes a webcam that captures the task equipment on the round table as well as the faces and upper bodies of all participants around the table. Kinect Azure cameras are stationed around the room to capture RGB and depth video of all participants from different angles. The equipment on the round table includes a set of 6 blocks (of varying weight, size, and color), a balance scale, a worksheet with weights in grams and spaces to place the blocks, and a computer with a survey where participants submit their responses to questions throughout the three parts of the task.

Participants are first given a balance scale to determine the weights of five colorful wooden blocks. They are told that one block weighs 10 grams, but that they must determine the weights of the remaining blocks using the balance scale. As the weight of each remaining block is discovered, the participants place the block on a worksheet next to its corresponding weight; they must also submit their final answer for the weight of each block in the online survey form on the computer. In the second part of the activity, participants are given a new mystery block and must identify its weight without using the scale (i.e., participants have to deduce the weight based on the pattern observed in the initial block weights). Groups are given two chances to submit the weight they agree upon within the online survey. If their first guess is incorrect, the survey provides a hint: consider the pattern in the weights of the prior blocks. Finally, participants are asked to determine the weight of another mystery block that is not physically present and to explain how they determined it. The participants once again submit their answer as a group in the online survey and are given two chances (with a hint after the first guess if it is incorrect).

Participant gestures are annotated using the GAMR framework. Within the WTD, most gestures performed are deictic, indicating reference to an object or a location. There are some iconic gestures, which represent attributes of an action or object. Emblematic gestures have meanings set by cultural convention, rather than by any physical or metaphorical similarity to their content. GAMR was dually annotated, and the two annotation sets produced a SMATCH F1 score of 0.75.
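For readers unfamiliar with the agreement metric: SMATCH scores two semantic graphs by the F1 of their overlapping triples. The sketch below illustrates only the scoring step and is not the actual SMATCH implementation, which additionally searches over variable alignments between the two graphs; the example triples are hypothetical.

```python
def smatch_f1(gold, pred):
    """F1 over the overlap of two triple sets.

    Real SMATCH also searches for the variable mapping that maximizes
    this score; here we assume variables are already aligned.
    """
    gold, pred = set(gold), set(pred)
    matched = len(gold & pred)
    precision = matched / len(pred) if pred else 0.0
    recall = matched / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical gesture-meaning triples from two annotators:
gold = {("g0", ":instance", "deixis"), ("g0", ":ARG0", "p1"),
        ("g0", ":ARG1", "b0"), ("b0", ":instance", "block")}
pred = {("g0", ":instance", "deixis"), ("g0", ":ARG0", "p1"),
        ("g0", ":ARG1", "b1"), ("b1", ":instance", "scale")}
print(smatch_f1(gold, pred))  # two of four triples agree -> F1 = 0.5
```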

The NICE coding scheme captures nonverbal behaviors when people are working together in groups, such as the direction of gaze, posture (e.g., leaning toward or away from the activity area), and usage of tools (including pointing at or to the tool, as well as directly manipulating it). NICE was annotated by an expert over Groups 1-3 and Group 5.

Collaborative problem solving (CPS) coding is performed at the utterance level. Annotators watched the video and coded each utterance with potentially multiple labels based on content, context, and position in the conversational sequence. As with the GAMR annotations, videos were annotated for CPS by two annotators (kappa = 0.62) and adjudicated by an expert.
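The reported agreement statistic is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The following is a minimal sketch, simplified to one label per utterance (the CPS scheme itself allows multiple labels per utterance); the label values are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa between two annotators' labels for the same items."""
    n = len(labels_a)
    # Observed agreement: fraction of items both annotators label identically.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label rates.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(counts_a[k] * counts_b[k] for k in counts_a.keys() | counts_b.keys()) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical per-utterance CPS labels from two annotators:
a = ["negotiate", "negotiate", "maintain", "maintain"]
b = ["negotiate", "maintain", "maintain", "maintain"]
print(cohens_kappa(a, b))  # 0.75 observed, 0.5 by chance -> kappa = 0.5
```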

Audio from all groups was segmented into utterances and transcribed by human transcribers ("oracle"), by Google's Voice Activity Detector (VAD) and Cloud ASR, and by OpenAI's Whisper model.
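The segmentation step can be illustrated with a toy energy-threshold voice activity detector. This is only a sketch of the general idea, not the dataset's actual pipeline (which used Google's VAD and human transcribers); the threshold and gap parameters are illustrative.

```python
def segment_utterances(frame_energies, threshold=0.02, min_gap=5):
    """Group consecutive above-threshold frames into (start, end) spans.

    frame_energies: per-frame RMS energy values.
    A span ends once at least `min_gap` consecutive frames fall below
    `threshold`. Returned spans use half-open [start, end) frame indices.
    """
    spans, start, silence = [], None, 0
    for i, energy in enumerate(frame_energies):
        if energy >= threshold:
            if start is None:
                start = i
            silence = 0
        elif start is not None:
            silence += 1
            if silence >= min_gap:
                spans.append((start, i - silence + 1))
                start, silence = None, 0
    if start is not None:  # utterance still open at end of audio
        spans.append((start, len(frame_energies) - silence))
    return spans


energies = [0.0, 0.1, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.1, 0.0]
print(segment_utterances(energies))  # [(1, 3), (8, 9)]
```

A real pipeline would compute frame energies from the waveform and pad each span with a small margin before passing it to an ASR model.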

The raw depth data from the Kinects has been processed into numerical skeleton data representing the orientation of each bone on each body in each frame.
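Azure Kinect body tracking reports each joint's orientation as a quaternion, and a common way to recover a bone's world-space direction is to rotate the bone's rest-pose axis by that quaternion. The sketch below shows the quaternion rotation itself; how the released skeleton files encode these values is not specified here, so the (w, x, y, z) input layout is an assumption.

```python
def rotate(q, v):
    """Rotate 3-vector v by unit quaternion q = (w, x, y, z).

    Uses the expansion v' = v + w*t + q_vec x t, where t = 2*(q_vec x v),
    which avoids building a full rotation matrix.
    """
    w, x, y, z = q
    tx = 2 * (y * v[2] - z * v[1])
    ty = 2 * (z * v[0] - x * v[2])
    tz = 2 * (x * v[1] - y * v[0])
    return (v[0] + w * tx + (y * tz - z * ty),
            v[1] + w * ty + (z * tx - x * tz),
            v[2] + w * tz + (x * ty - y * tx))


# 90-degree rotation about the z-axis maps the x-axis onto the y-axis:
s = 0.7071067811865476  # sin(45 deg) = cos(45 deg)
print(rotate((s, 0.0, 0.0, s), (1.0, 0.0, 0.0)))  # approximately (0, 1, 0)
```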

Notes

This work was supported in part by the United States National Science Foundation (NSF) under grant #DRL 2019805 to the University of Colorado, with subcontracts to Colorado State University, Brandeis University, and the University of Wisconsin–Madison.

Files

Weights Task Dataset.zip (24.2 GB)
md5:91e2b49275d100eff288eb7f56cf9a37

Additional details

Funding

AI Institute: Institute for Student-AI Teaming 2019805
National Science Foundation