SMART Challenge Series: Context-Aware Student Engagement Detection

Sharma, Gulshan; Li, Jialin; Salam, Hanan

doi:10.5281/zenodo.19322996

Published March 30, 2026 | Version v1

Dataset Restricted


1. New York University Abu Dhabi

Context-Aware Student Engagement Detection (CASED) Dataset

Welcome to the first iteration of the SMART Challenge Series hosted by the Social Machines and Robotics (SMART) Lab, New York University Abu Dhabi, in collaboration with colleagues from The University of Queensland, Istanbul Technical University, and Utrecht University.

This year’s challenge focuses on Context-Aware Student Engagement Detection (CASED) and will be hosted as part of the 28th ACM International Conference on Multimodal Interaction (ICMI 2026) Grand Challenges.

We invite researchers to explore multimodal and context-aware approaches for understanding student engagement in online classroom environments.

Challenge Overview

Participants develop models of student engagement in real-world online classroom environments for two tracks:

Track 1 (Regression): Predict continuous engagement values.
Track 2 (Classification): Predict binary engagement labels.

Dataset Details

The dataset was collected in the wild from online lectures on Artificial Intelligence and Mathematics conducted over Zoom. It contains a total of 8,472 video clips with participant-independent training and test splits:

Training set: 4,978 clips
Test set: 3,494 clips

Each data sample corresponds to a 10-second video clip. Personality metadata for both the student and the instructor is provided for the training set only.

Dataset Structure & Video Views

The dataset is split into development_data (training videos and labels) and evaluation_data (test videos only), with the optional personality metadata for training stored in a separate metadata_optional folder. We provide two distinct video views for each sample:

student_only: The isolated webcam feed of the student. Resized to 224 x 224 resolution using letterbox padding.
video_cascade: A composite spatial layout providing full classroom context at 1280 x 1280 resolution (with letterbox padding) and synchronized audio. The layout is structured as follows:

Top: Student video stream
Bottom Left: Instructor video stream
Bottom Right: Screen-shared lecture content stream
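The layout above can be turned into crop boxes when the three streams need to be processed separately. The sketch below is only illustrative: the exact region boundaries are not documented here, so the function name `cascade_regions` and the default split fractions (top half, bottom half divided evenly) are assumptions that should be checked against the actual videos and the README.

```python
def cascade_regions(width=1280, height=1280, top_fraction=0.5, left_fraction=0.5):
    """Return (left, top, right, bottom) pixel boxes for the three streams.

    ASSUMPTION: the boundaries between streams are not specified in the
    dataset description; the default fractions are hypothetical and any
    letterbox padding inside each region is left untouched.
    """
    if not (0 < top_fraction < 1 and 0 < left_fraction < 1):
        raise ValueError("fractions must lie strictly between 0 and 1")
    row_cut = round(height * top_fraction)   # boundary between top and bottom rows
    col_cut = round(width * left_fraction)   # boundary between bottom-left and bottom-right
    return {
        "student": (0, 0, width, row_cut),            # top
        "instructor": (0, row_cut, col_cut, height),  # bottom left
        "screen": (col_cut, row_cut, width, height),  # bottom right
    }


regions = cascade_regions()
```

Each box can be applied to a decoded frame stored as a height x width x channels array, for example `frame[top:bottom, left:right]`.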

Dataset Directory Structure

CASED_Dataset/
│
├── README.pdf
├── submission_template.csv          # A sample submission template
│
├── development_data/
│   ├── student_only/
│   │   └── train/                   # .mp4 video files (Student isolated, 224x224)
│   ├── video_cascade/
│   │   └── train/                   # .mp4 video files (Composite context, 1280x1280)
│   └── train.csv                    # Ground truth labels for training
│
├── evaluation_data/
│   ├── student_only/
│   │   └── test/                    # .mp4 video files (Student isolated, 224x224)
│   └── video_cascade/
│       └── test/                    # .mp4 video files (Composite context, 1280x1280)
│
└── metadata_optional/
    ├── personality_train.csv        # Student personality metadata (Training set only)
    └── instructor_personality.csv   # Instructor personality metadata (Training set only)

*(Note: The evaluation labels are hidden and held securely on Codabench)*
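Given the directory structure above, the paths to both views of a clip can be built from its filename. This is a minimal sketch: the helper name `clip_paths` and the root location are assumptions, and the example filename is used purely for illustration (the description does not say which split it belongs to).

```python
from pathlib import Path

VIEWS = ("student_only", "video_cascade")
SPLIT_PARENT = {"train": "development_data", "test": "evaluation_data"}


def clip_paths(root, video_title, split="train"):
    """Map a clip filename to its path in each view folder.

    `root` is wherever CASED_Dataset/ was extracted (hypothetical location).
    """
    if split not in SPLIT_PARENT:
        raise ValueError(f"split must be one of {sorted(SPLIT_PARENT)}")
    base = Path(root) / SPLIT_PARENT[split]
    return {view: base / view / split / video_title for view in VIEWS}


paths = clip_paths("CASED_Dataset", "05082021_Catherine_1_cardin_100.mp4")
```

The same filename is looked up in both view folders, so one call yields the isolated student clip and its composite counterpart.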

Filename Structure

To map the video files to the personality metadata, participants must parse the video filenames. The filenames follow a consistent structure that contains both the instructor and student names.

Example filename: 05082021_Catherine_1_cardin_100.mp4

Instructor Name: Catherine (Located after the date)
Student Name: cardin (Located before the final clip index)
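The parsing rule above might be sketched as follows. The function name `parse_clip_name` is hypothetical, and the sketch assumes that fields are separated by underscores and that names themselves contain no underscores. The meaning of the field between the instructor and student names (`1` in the example) is not documented here, so it is ignored, and the date is kept as a raw string because its format is not stated.

```python
from pathlib import Path


def parse_clip_name(filename):
    """Extract instructor and student names from a clip filename.

    ASSUMPTION: underscore-separated fields, with the instructor in the
    second field and the student in the second-to-last field.
    """
    parts = Path(filename).stem.split("_")
    if len(parts) < 4:
        raise ValueError(f"unexpected filename structure: {filename!r}")
    return {
        "date": parts[0],         # raw string, format not specified
        "instructor": parts[1],   # located after the date
        "student": parts[-2],     # located before the final clip index
        "clip_index": parts[-1],
    }


info = parse_clip_name("05082021_Catherine_1_cardin_100.mp4")
```

For the example filename this yields `Catherine` as the instructor and `cardin` as the student.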

Label Format (train.csv)

The train.csv file contains the ground truth for both challenge tracks. The columns are structured as follows:

video_title: The exact filename of the 10-second video clip (identical in both view folders).
value: The continuous engagement score (Target for Track 1 - Regression).
label: The binary engagement classification (Target for Track 2 - Classification).

0 = Engaged
1 = Disengaged
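A minimal sketch of reading these columns with the standard library is shown below. The column names come from the description above; the helper name `read_labels` and the sample rows are invented for illustration and do not reflect real clips, real score ranges, or any relationship between `value` and `label`.

```python
import csv
import io

# Label encoding as stated in the dataset description.
LABEL_NAMES = {0: "Engaged", 1: "Disengaged"}


def read_labels(handle):
    """Read train.csv-style rows into typed dictionaries."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({
            "video_title": row["video_title"],
            "value": float(row["value"]),   # Track 1 target (regression)
            "label": int(row["label"]),     # Track 2 target (classification)
        })
    return rows


# Hypothetical rows standing in for the real file.
sample = io.StringIO(
    "video_title,value,label\n"
    "clip_a.mp4,0.82,0\n"
    "clip_b.mp4,0.10,1\n"
)
labels = read_labels(sample)
```

With the real data, the handle would come from opening `development_data/train.csv` with `newline=""`. Note that 0 denotes the engaged class, so code that treats 1 as the positive class is scoring disengagement.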

Personality Metadata (optional)

Personality trait scores are provided for the training set via two files. The 10 personality dimensions measured are: Introversion, Trust, Emotional Stability, Low Conscientiousness, Sociability, Aesthetic Sensitivity, Criticalness, Conscientiousness, Neuroticism, and Creativity.

Student Personality (personality_train.csv)

student_name: The identifier linking the metadata to the student in the video clip (parsed from the filename as explained above).
Contains the 10 personality dimension scores for the students.

Instructor Personality (instructor_personality.csv)

Instructor Name: The identifier linking the metadata to the instructor in the video clip (parsed from the filename as explained above).
Contains the 10 personality dimension scores for the instructors.
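Linking clips to the two metadata files could look like the sketch below. It assumes each metadata file has already been loaded into a dictionary keyed by name; the helper names, the case-insensitive matching, and the placeholder trait scores are all assumptions for illustration, not part of the dataset specification.

```python
from pathlib import Path


def names_from_filename(filename):
    """Return (instructor, student), assuming underscore-separated fields."""
    parts = Path(filename).stem.split("_")
    return parts[1], parts[-2]


def attach_personality(video_titles, student_traits, instructor_traits):
    """Attach trait dictionaries to each clip; None when a name is missing.

    ASSUMPTION: names are matched case-insensitively, since the casing used
    in the metadata files is not documented here.
    """
    students = {name.casefold(): traits for name, traits in student_traits.items()}
    instructors = {name.casefold(): traits for name, traits in instructor_traits.items()}
    joined = []
    for title in video_titles:
        instructor, student = names_from_filename(title)
        joined.append({
            "video_title": title,
            "student_traits": students.get(student.casefold()),
            "instructor_traits": instructors.get(instructor.casefold()),
        })
    return joined


# Placeholder scores, not real data.
student_traits = {"cardin": {"Trust": 3.5}}
instructor_traits = {"Catherine": {"Trust": 4.0}}
rows = attach_personality(
    ["05082021_Catherine_1_cardin_100.mp4"], student_traits, instructor_traits
)
```

Because personality metadata is provided for the training set only, a model that consumes these features needs a fallback for test clips, which is why missing names map to `None` rather than raising an error.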

Files

Restricted

The record is publicly accessible, but the files are restricted.

Request access


***CASED Dataset Limited Use Agreement***

The dataset provided for the Context-Aware Student Engagement Detection (CASED) Challenge is released under restricted access and is available only to registered participants of the challenge.

By registering for this challenge, participants agree to the following terms:

The dataset will be used solely for participation in the CASED Challenge and for academic research purposes directly related to the challenge.

The dataset may not be redistributed, shared, copied, or made publicly available in whole or in part to any third party.

Any publication or additional research using the dataset after the conclusion of the competition requires prior written approval from the dataset authors.

Participants agree to acknowledge the dataset and cite the official dataset/challenge paper (to be released later) in any approved publications.

By registering for the CASED Challenge and accessing the dataset, participants confirm that they have read and agree to comply with the terms of this Dataset Limited Use Agreement.

***Warranty and Liability Disclaimer***

The dataset is provided “as is” without any warranty of any kind, express or implied, including but not limited to warranties of accuracy, completeness, fitness for a particular purpose, or non-infringement. The dataset authors and organizers shall not be liable for any damages or losses arising from the use of the dataset or participation in the challenge.
