Published March 27, 2026 | Version v1
Dataset Open

Masked Training for Robust Arrhythmia Detection from Digitalized Multiple Layout ECG Images

Authors/Creators

Description

Data Description

This dataset is a digitized derivative of the publicly available PTB-XL dataset, built to train and evaluate robust models for ECG analysis. The data generation pipeline has three stages:

  • ECG Image Generation: Synthetic ECG images were first rendered from the original PTB-XL 1-dimensional time-series waveforms using the open-source ecg-image-kit tool.

  • Signal Digitization: The generated images were then digitized back into 1-dimensional signals using the digitization algorithm that won first place in the 2024 Computing in Cardiology (CinC) Challenge.

  • Label Alignment: The resulting digitized signals are aligned with the ground-truth clinical diagnostic labels provided in the original PTB-XL dataset.

The image generation formats, the digitization process, and their applications are described in detail in our manuscript, "Masked Training for Robust Arrhythmia Detection from Digitalized Multiple Layout ECG Images".
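
As an illustration of the label alignment stage, the sketch below pairs digitized signals with PTB-XL labels by record identifier. It assumes the labels come from a PTB-XL style metadata CSV with an ecg_id column and a scp_codes column holding a stringified dictionary, as in the original PTB-XL release. How this dataset stores its signals is not stated here, so the signals dictionary, the toy CSV, and the function names load_ptbxl_labels and align are hypothetical placeholders rather than the authors' code.

```python
import ast
import csv
import io


def load_ptbxl_labels(csv_file):
    """Map ecg_id -> dict of SCP codes from a PTB-XL style metadata CSV."""
    labels = {}
    for row in csv.DictReader(csv_file):
        # PTB-XL stores scp_codes as a stringified Python dict.
        labels[int(row["ecg_id"])] = ast.literal_eval(row["scp_codes"])
    return labels


def align(signals, labels):
    """Pair each signal with its labels; also return ids that have no label."""
    paired = {i: (sig, labels[i]) for i, sig in signals.items() if i in labels}
    missing = sorted(set(signals) - set(labels))
    return paired, missing


# Toy stand-ins (assumptions): two labelled records, three digitized signals.
toy_csv = io.StringIO('''ecg_id,scp_codes
1,"{'NORM': 100.0, 'SR': 0.0}"
2,"{'AFIB': 100.0}"
''')
signals = {1: [0.0, 0.1], 2: [0.2, 0.3], 3: [0.4]}

paired, missing = align(signals, load_ptbxl_labels(toy_csv))
print(sorted(paired), missing)
```

Reporting the unmatched identifiers, instead of silently dropping them, makes it easy to check that every digitized record found its source label.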

Files (146.0 MB)

md5:7e5b361ad4a08b24f8e91aaab9e04791 (146.0 MB)
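
After downloading, the archive can be checked against the MD5 checksum listed above. This is a minimal sketch using Python's standard hashlib module. The file name is not shown on this page, so any path passed to these functions is the reader's own, and the function names md5_of_file and verify are illustrative.

```python
import hashlib

# Checksum as listed for this record.
EXPECTED_MD5 = "7e5b361ad4a08b24f8e91aaab9e04791"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_MD5):
    """True if the file at `path` matches the expected checksum."""
    return md5_of_file(path) == expected
```

Calling verify with the local path of the downloaded file returns True only when the download is intact. Reading in chunks keeps memory use low for a file of this size.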