SteelBench v1.0: A Benchmark Dataset for Steel Mechanical Property Prediction with Grade-Shift Evaluation
Authors/Creators
Description
SteelBench is an open benchmark for steel mechanical property prediction that links heat-level chemistry, heat-treatment parameters, and tensile properties with grade identity and data-origin labels. The dataset contains 1,636 samples across 594 steel grades and 17 steel families, aggregated from public and semi-public sources: MMPDS, NIMS, EMK, Kaggle, and laboratory measurements.
Two variants are included:
- steelbench_core.csv — strict publication variant (original reported values only)
- steelbench_full.csv — filled training variant (missing heat-treatment parameters imputed using documented assumptions)
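To see how the two variants differ in practice, one option is to count how many rows lack heat-treatment parameters. The sketch below is only an illustration: the column names `austenitize_T` and `temper_T` come from the feature list in this record, but the assumption that missing values appear as blank cells, the helper name `count_missing`, and the toy rows are all hypothetical.

```python
import csv
import io

# Heat-treatment columns named in this record's feature list.
HEAT_TREATMENT_COLS = ("austenitize_T", "temper_T")


def count_missing(csv_file, columns=HEAT_TREATMENT_COLS):
    """Return (row_count, {column: rows where the cell is absent or blank})."""
    missing = {col: 0 for col in columns}
    total = 0
    for row in csv.DictReader(csv_file):
        total += 1
        for col in columns:
            # Assumption: a missing value is stored as an empty cell.
            if not (row.get(col) or "").strip():
                missing[col] += 1
    return total, missing


# Toy rows for illustration only; these are not values from SteelBench.
toy_core = io.StringIO(
    "C,Mn,austenitize_T,temper_T,tensile_strength\n"
    "0.40,0.80,850,600,950\n"
    "0.20,1.10,,,520\n"
    "0.35,0.70,860,,1100\n"
)
total, missing = count_missing(toy_core)
print(total, missing)
```

With the real files, the same helper could be called on `open("steelbench_core.csv", newline="")` and on `open("steelbench_full.csv", newline="")` to compare the two variants, provided the blank-cell assumption holds.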
Key features:
- 11 input features: C, Mn, Si, Cr, Ni, Mo, V, Cu, Al, austenitize_T, temper_T
- Targets: tensile_strength, yield_strength, elongation
- Grade-shift evaluation protocols: RandomKFold, GKF-grade, LOFO (Leave-One-Family-Out), LOSO (Leave-One-Source-Out)
- Data-origin (provenance) labels for each filled field
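The group-based protocols listed above share one idea: all samples carrying the same label (grade, family, or source) stay on the same side of the train/test split. The following is a minimal standard-library sketch of that idea, not the reference implementation from the repository. The function names, the greedy fold balancing, and the toy grade and family labels are assumptions made for illustration; scikit-learn's `GroupKFold` and `LeaveOneGroupOut` offer the same kind of split.

```python
from collections import defaultdict


def leave_one_group_out(groups):
    """Yield (held_out, train_idx, test_idx) once per distinct group label."""
    by_group = defaultdict(list)
    for idx, g in enumerate(groups):
        by_group[g].append(idx)
    for held_out in sorted(by_group):
        test_idx = by_group[held_out]
        train_idx = [i for i, g in enumerate(groups) if g != held_out]
        yield held_out, train_idx, test_idx


def group_k_fold(groups, n_splits):
    """Yield (train_idx, test_idx) with every group kept inside one fold."""
    by_group = defaultdict(list)
    for idx, g in enumerate(groups):
        by_group[g].append(idx)
    if n_splits > len(by_group):
        raise ValueError("n_splits cannot exceed the number of distinct groups")
    # Greedy balancing: place the largest groups first into the smallest fold.
    folds = [[] for _ in range(n_splits)]
    for g in sorted(by_group, key=lambda g: (-len(by_group[g]), g)):
        min(folds, key=len).extend(by_group[g])
    for test_idx in folds:
        held = set(test_idx)
        train_idx = [i for i in range(len(groups)) if i not in held]
        yield train_idx, sorted(test_idx)


# Toy labels for illustration only; not taken from the dataset.
families = ["carbon", "carbon", "stainless", "tool", "stainless", "carbon"]
grades = ["1045", "1045", "304", "D2", "316", "1020"]

lofo = list(leave_one_group_out(families))  # Leave-One-Family-Out style
gkf = list(group_k_fold(grades, 2))         # grade-grouped k-fold style

for held_out, train_idx, test_idx in lofo:
    print(held_out, train_idx, test_idx)
```

Passing source labels instead of family labels to `leave_one_group_out` gives a Leave-One-Source-Out style split. A random k-fold baseline ignores these labels, so samples of the same grade can land in both train and test folds.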
Associated paper: "SteelBench: A Physics-Aware Benchmark for Steel Mechanical Property Prediction" (under review at KDD 2026).
Reference code: https://github.com/cornada/steelbench
Files
croissant.json
Additional details
Software
- Repository URL: https://github.com/cornada/steelbench
- Programming language: Python