There is a newer version of the record available.

Published October 22, 2025 | Version 1.0.0
Dataset Open

Batch Distillation Data for Developing Machine Learning Anomaly Detection Methods

Description

This database provides a resource for machine-learning-based anomaly detection (AD) in chemical processes. It includes data from 119 experimental runs on a laboratory-scale batch distillation plant across diverse operating conditions and mixtures, with paired fault-free runs and deliberately induced anomalies. For each experiment, we provide multivariate time series from sensors and actuators with measurement-uncertainty estimates, complemented by unconventional modalities: online benchtop NMR concentration profiles, images, and audio recordings. Every anomalous experiment is documented with extensive metadata and expert annotations that capture both the presence and the causes of anomalies. The data are organized in a structured, ready-to-use format and made freely available to support benchmarking and development of advanced AD methods. By linking anomalies to their underlying causes, the database also enables research on interpretable and explainable machine learning (ML), as well as on strategies for anomaly mitigation.

Files

Batch_Distillation_Plant_M-202210.zip
Size: 85.8 GB
md5:5791ecea6475b9e9cbec31c237b1bfae
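
Because the archive is large, it may be worth checking a finished download against the MD5 checksum listed above before unpacking it. The following is a minimal sketch, not part of the record itself: the helper names `file_md5` and `matches_published_checksum` are hypothetical, and the code assumes the archive was saved locally under its listed name. The file is read in chunks so that the full 85.8 GB never has to fit in memory.

```python
import hashlib

# Values copied from the file listing above.
ARCHIVE_NAME = "Batch_Distillation_Plant_M-202210.zip"
EXPECTED_MD5 = "5791ecea6475b9e9cbec31c237b1bfae"


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it chunk by chunk."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops at the empty bytes object returned at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` has the expected MD5 digest."""
    return file_md5(path) == expected.lower()
```

Calling `matches_published_checksum(ARCHIVE_NAME)` from the download directory should return `True` for an intact copy; a `False` result suggests a truncated or corrupted download.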

Additional details

Related works

Is described by
Publication: arXiv:2510.18075 (arXiv)

Funding

Deutsche Forschungsgemeinschaft
Deep Learning on Sparse Chemical Process Data (459419731)