SMART Challenge Series: Context-Aware Student Engagement Detection
Description
Context-Aware Student Engagement Detection (CASED) Dataset
Welcome to the first iteration of the SMART Challenge Series hosted by the Social Machines and Robotics (SMART) Lab, New York University Abu Dhabi, in collaboration with colleagues from The University of Queensland, Istanbul Technical University, and Utrecht University.
This year’s challenge focuses on Context-Aware Student Engagement Detection (CASED) and will be hosted as part of the 28th ACM International Conference on Multimodal Interaction (ICMI 2026) Grand Challenges.
We invite researchers to explore multimodal and context-aware approaches for understanding student engagement in online classroom environments.
Challenge Overview
The CASED challenge invites researchers to explore context-aware approaches for understanding student engagement in real-world online classroom environments. Participants will develop models for two tracks:
- Track 1 (Regression): Predict continuous engagement values.
- Track 2 (Classification): Predict binary engagement labels.
Dataset Details
The dataset was collected during online lectures on Artificial Intelligence and Mathematics, conducted over Zoom in unconstrained (in-the-wild) settings. It contains 8,472 video clips in total, divided into participant-independent training and test splits:
- Training set: 4,978 clips
- Test set: 3,494 clips
Each data sample corresponds to a 10-second video clip. Personality metadata for both the student and the instructor is provided for the training set only.
Dataset Structure & Video Views
The dataset is split into development_data (containing videos, labels, and metadata for training) and evaluation_data (containing only the test videos). We provide two distinct video views for each sample:
- student_only: The isolated webcam feed of the student, resized to 224 x 224 resolution using letterbox padding.
- video_cascade: A composite spatial layout that provides full classroom context at 1280 x 1280 resolution (with letterbox padding) and synchronized audio. The layout is structured as follows:
  - Top: Student video stream
  - Bottom Left: Instructor video stream
  - Bottom Right: Screen-shared lecture content stream
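Letterbox padding, as mentioned for both views, scales a frame to fit a square target while preserving its aspect ratio and fills the remaining area with bars. The sketch below computes only the resulting geometry. The function name, the centered placement, and the rounding are assumptions for illustration; the exact preprocessing applied to the CASED videos is not described here.

```python
def letterbox_geometry(src_w, src_h, target=224):
    """Compute the scaled size and padding for a square letterbox.

    Returns (new_w, new_h, pad_left, pad_top, pad_right, pad_bottom).
    Centered padding is an assumption, not a documented CASED detail.
    """
    # Use the smaller scale factor so the whole frame fits inside the square.
    scale = min(target / src_w, target / src_h)
    new_w = max(1, round(src_w * scale))
    new_h = max(1, round(src_h * scale))

    # Split the leftover space between the two sides of each axis.
    pad_w = target - new_w
    pad_h = target - new_h
    pad_left = pad_w // 2
    pad_top = pad_h // 2
    return new_w, new_h, pad_left, pad_top, pad_w - pad_left, pad_h - pad_top


# A 16:9 webcam frame mapped into the 224 x 224 student_only format.
print(letterbox_geometry(640, 360, target=224))
```

For a 640 x 360 input, this gives a 224 x 126 image with 49-pixel bars above and below, which is useful to know when cropping away the padding before feature extraction.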
Dataset Directory Structure
CASED_Dataset/
│
├── README.pdf
├── submission_template.csv              # A sample submission template
│
├── development_data/
│   ├── student_only/
│   │   └── train/                       # .mp4 video files (Student isolated, 224x224)
│   ├── video_cascade/
│   │   └── train/                       # .mp4 video files (Composite context, 1280x1280)
│   └── train.csv                        # Ground truth labels for training
│
├── evaluation_data/
│   ├── student_only/
│   │   └── test/                        # .mp4 video files (Student isolated, 224x224)
│   └── video_cascade/
│       └── test/                        # .mp4 video files (Composite context, 1280x1280)
│
└── metadata_optional/
    ├── personality_train.csv            # Student personality metadata (Training set only)
    └── instructor_personality.csv       # Instructor personality metadata (Training set only)
*(Note: The evaluation labels are hidden and held securely on Codabench)*
Filename Structure
To map the video files to the personality metadata, participants must parse the video filenames. The filenames follow a consistent structure that contains both the instructor and student names.
Example filename: 05082021_Catherine_1_cardin_100.mp4
- Instructor Name: Catherine (located after the date)
- Student Name: cardin (located before the final clip index)
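The parsing described above can be sketched as follows. This is a minimal illustration that assumes fields are separated by underscores, that the instructor name is always the second field, and that the student name is always the second-to-last field. The meaning of the third field (`1` in the example) is not documented here, so it is ignored. `ClipInfo` and `parse_clip_filename` are hypothetical names, not part of the dataset tooling.

```python
from pathlib import Path
from typing import NamedTuple


class ClipInfo(NamedTuple):
    date: str
    instructor: str
    student: str
    clip_index: int


def parse_clip_filename(filename: str) -> ClipInfo:
    """Extract instructor and student names from a CASED clip filename.

    Assumes the pattern <date>_<instructor>_<...>_<student>_<clip index>.mp4
    and that names themselves contain no underscores.
    """
    parts = Path(filename).stem.split("_")
    if len(parts) < 5:
        raise ValueError(f"Unexpected filename structure: {filename!r}")
    return ClipInfo(
        date=parts[0],               # kept as a string; the date format is not specified
        instructor=parts[1],         # located after the date
        student=parts[-2],           # located before the final clip index
        clip_index=int(parts[-1]),
    )


info = parse_clip_filename("05082021_Catherine_1_cardin_100.mp4")
print(info.instructor, info.student, info.clip_index)
```

Indexing the student name from the end rather than from the start keeps the parse stable even if the undocumented middle portion turns out to contain more than one field.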
Label Format (train.csv)
The train.csv file contains the ground truth for both challenge tracks. The columns are structured as follows:
- video_title: The exact filename of the 10-second video clip (matches identically across both view folders).
- value: The continuous engagement score (Target for Track 1 - Regression).
- label: The binary engagement classification (Target for Track 2 - Classification).
  - 0 = Engaged
  - 1 = Disengaged
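A minimal sketch of reading these columns with Python's standard `csv` module is shown below. The column names come from the description above; the sample rows, the `load_labels` helper, and the tolerance for labels written as `0.0`/`1.0` are illustrative assumptions. Note that `1` marks the disengaged class, so it is worth making the mapping explicit rather than treating `1` as "engaged".

```python
import csv
import io

# Label encoding as documented for train.csv.
LABEL_NAMES = {0: "Engaged", 1: "Disengaged"}


def load_labels(csv_file):
    """Read train.csv-style rows into typed dictionaries."""
    rows = []
    for row in csv.DictReader(csv_file):
        rows.append({
            "video_title": row["video_title"],
            "value": float(row["value"]),        # Track 1 regression target
            "label": int(float(row["label"])),   # Track 2 classification target
        })
    return rows


# Made-up rows for illustration only; real values come from train.csv.
sample = io.StringIO(
    "video_title,value,label\n"
    "05082021_Catherine_1_cardin_100.mp4,0.5,0\n"
    "05082021_Catherine_1_cardin_101.mp4,0.25,1\n"
)
for row in load_labels(sample):
    print(row["video_title"], row["value"], LABEL_NAMES[row["label"]])
```

With the real file, the same function can be called as `load_labels(open("development_data/train.csv", newline=""))`.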
Personality Metadata (optional)
Personality trait scores are provided for the training set via two files. The 10 personality dimensions measured are: Introversion, Trust, Emotional Stability, Low Conscientiousness, Sociability, Aesthetic Sensitivity, Criticalness, Conscientiousness, Neuroticism, and Creativity.
- Student Personality (personality_train.csv)
  - student_name: The identifier linking the metadata to the student in the video clip (parsed from the filename as explained above).
  - Contains the 10 personality dimension scores for the students.
- Instructor Personality (instructor_personality.csv)
  - Instructor Name: The identifier linking the metadata to the instructor in the video clip (parsed from the filename as explained above).
  - Contains the 10 personality dimension scores for the instructors.
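Linking a clip to both metadata files could look like the sketch below. It assumes the key columns are literally named `student_name` and `Instructor Name` as listed above, that names match after trimming whitespace and ignoring case, and that the instructor and student sit in the second and second-to-last underscore-separated filename fields. The trait header `Trust` and the scores in the sample data are made up; the real column headers should be checked in the CSV files.

```python
import csv
import io


def index_by(csv_file, key_column):
    """Map the case-folded identifier in key_column to its full CSV row."""
    return {
        row[key_column].strip().casefold(): row
        for row in csv.DictReader(csv_file)
    }


def names_from_title(video_title):
    """Return (instructor, student) parsed from a clip filename."""
    parts = video_title.rsplit(".", 1)[0].split("_")
    return parts[1], parts[-2]


def attach_personality(video_title, students, instructors):
    """Look up both personality rows for a clip; None if a name is missing."""
    instructor, student = names_from_title(video_title)
    return {
        "video_title": video_title,
        "student_traits": students.get(student.casefold()),
        "instructor_traits": instructors.get(instructor.casefold()),
    }


# Illustrative in-memory files; headers other than the key columns are hypothetical.
students = index_by(io.StringIO("student_name,Trust\ncardin,3.5\n"), "student_name")
instructors = index_by(io.StringIO("Instructor Name,Trust\nCatherine,4.0\n"), "Instructor Name")

merged = attach_personality("05082021_Catherine_1_cardin_100.mp4", students, instructors)
print(merged["student_traits"], merged["instructor_traits"])
```

Because the personality files cover the training set only, the lookup returns `None` for any name that is absent, so models that use these features need a fallback for clips without metadata.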