Published June 2, 2026 | Version v1
Dataset Open

CourseFactory Workflow-Mining Dataset (De-Identified) v1

Description

A privacy-preserving, version-level benchmark for mining AI-assisted course-design workflows, derived from a production snapshot of the deployed CourseFactory system.

The release contains 14,598 de-identified course-version records with full-precision non-text features, structural-quality targets, and reproducible split columns for both chronological and new-project evaluation. It also includes a datasheet, a reproduction notebook, date-free aggregate outputs, and the exact de-identification builder.
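To illustrate what chronological and new-project split columns can look like, here is a minimal sketch. It is not the release's own split logic, which ships with the archive. The field names `version_id`, `project_key`, and `time_rank`, the `train`/`test` labels, and the `train_frac` parameter are all hypothetical and chosen only for illustration.

```python
# Hypothetical sketch: field names and split fractions are assumptions,
# not the dataset's documented schema.

def chronological_split(records, train_frac=0.8):
    """Label the earliest train_frac of records 'train', the rest 'test'."""
    ordered = sorted(records, key=lambda r: r["time_rank"])
    cutoff = int(len(ordered) * train_frac)
    train_ids = {r["version_id"] for r in ordered[:cutoff]}
    return {
        r["version_id"]: "train" if r["version_id"] in train_ids else "test"
        for r in records
    }


def new_project_split(records, train_frac=0.8):
    """Hold out whole projects, so test projects never appear in training."""
    first_seen = {}
    for r in records:
        key = r["project_key"]
        first_seen[key] = min(first_seen.get(key, r["time_rank"]), r["time_rank"])
    # Order projects by first appearance; break ties on the key for determinism.
    projects = sorted(first_seen, key=lambda k: (first_seen[k], k))
    cutoff = int(len(projects) * train_frac)
    train_projects = set(projects[:cutoff])
    return {
        r["version_id"]: "train" if r["project_key"] in train_projects else "test"
        for r in records
    }
```

The second function groups by project before splitting, which is what distinguishes a new-project evaluation from a purely chronological one: every version of a held-out project lands in the test side.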

Rows are course versions, not persons. The data contains no raw prompts, generated text, direct identifiers, or calendar timestamps, and no public-availability flag. Calendar time is replaced by a monotone ordinal rank; project linkage uses salted, release-local surrogate keys with no lookup back to production.
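The two transformations named above can be sketched as follows. This is an illustrative approximation only; the exact de-identification builder is the one included in the release. The function names, the use of HMAC-SHA256, the 16-character key length, and the dense-rank tie handling are assumptions made for this example.

```python
import hashlib
import hmac
import secrets


def surrogate_keys(project_ids, salt=None):
    """Map each project id to a salted surrogate key.

    Assumption: if the salt is generated per release and discarded after the
    build, the keys cannot be recomputed from production identifiers.
    """
    salt = salt if salt is not None else secrets.token_bytes(32)
    return {
        pid: hmac.new(salt, str(pid).encode("utf-8"), hashlib.sha256).hexdigest()[:16]
        for pid in set(project_ids)
    }


def ordinal_ranks(timestamps):
    """Replace timestamps with a monotone dense rank (equal values share a rank)."""
    order = {t: rank for rank, t in enumerate(sorted(set(timestamps)))}
    return [order[t] for t in timestamps]
```

The rank preserves ordering between records while dropping the calendar values themselves, and the surrogate keys preserve which versions belong to the same project without exposing the original project identifier.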

Files

coursefactory-workflow-mining-v1.1.zip (14.2 MB)
md5:0926611154102bcd6b23a1fb57cf566a