Published June 7, 2021 | Version 1.0.0
Dataset Open

SaMi-Trop: 12-lead ECG traces with age and mortality annotations

Description

SaMi-Trop is an NIH-funded prospective cohort of 1959 patients with chronic Chagas cardiomyopathy, designed to evaluate whether a clinical prediction rule based on ECG, brain natriuretic peptide (BNP) levels, and other biomarkers can be useful in clinical practice. A subset of the SaMi-Trop dataset, with age and mortality annotations and the corresponding ECG traces, is openly available here.

The dataset contains two files, `exams.csv` and `exams.hdf5`. They hold information about the first ECG exam of each of 1631 patients.

  • "exams.csv": a comma-separated values (CSV) file containing the columns:
    • "exam_id": identifier for internal use;
    • "age": patient age in years at the time of the exam;
    • "is_male": true if the patient is male, false if the patient is female;
    • "normal_ecg": true if the patient has a normal ECG;
    • "death": true if the patient died during the follow-up period;
    • "timey": time to death if the patient died; otherwise, the follow-up time.
  • "exams.hdf5": an HDF5 file containing a single dataset named `tracings`. This dataset is a `(1631, 4096, 12)` tensor. The first dimension corresponds to the 1631 different exams; the second dimension corresponds to the 4096 signal samples; the third dimension corresponds to the 12 leads of the ECG exams, in the following order: `{DI, DII, DIII, AVR, AVL, AVF, V1, V2, V3, V4, V5, V6}`. The signals are sampled at 400 Hz. Some signals originally have a duration of 10 seconds (10 * 400 = 4000 samples) and others of 7 seconds (7 * 400 = 2800 samples). To give them all the same size (4096 samples), they are padded with zeros on both sides. For instance, for a 7-second ECG signal with 2800 samples, 648 zero samples are added at the beginning and 648 at the end, yielding 4096 samples that are then saved in the HDF5 dataset.
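
The zero-padding scheme described above can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data, not the authors' preprocessing code: the helper names `pad_symmetric` and `strip_padding` are hypothetical, and the 48-sample padding for 10-second signals is inferred by applying the same symmetric rule given for the 7-second case.

```python
import numpy as np

TARGET_LEN = 4096  # samples per exam in `tracings`, per the description
FS = 400           # sampling frequency in Hz, per the description


def pad_symmetric(signal: np.ndarray, target_len: int = TARGET_LEN) -> np.ndarray:
    """Zero-pad a (n_samples, n_leads) array on both sides (hypothetical helper)."""
    total = target_len - signal.shape[0]
    if total < 0:
        raise ValueError("signal is longer than target_len")
    before = total // 2
    after = total - before
    # Pad only along the time axis; the lead axis is left untouched.
    return np.pad(signal, ((before, after), (0, 0)), mode="constant")


def strip_padding(padded: np.ndarray, n_samples: int) -> np.ndarray:
    """Recover the central n_samples rows, assuming symmetric padding."""
    before = (padded.shape[0] - n_samples) // 2
    return padded[before:before + n_samples]


# Synthetic stand-ins for a 7-second and a 10-second 12-lead recording.
rng = np.random.default_rng(0)
ecg_7s = rng.standard_normal((7 * FS, 12))
ecg_10s = rng.standard_normal((10 * FS, 12))

padded_7s = pad_symmetric(ecg_7s)    # 648 zero samples on each side
padded_10s = pad_symmetric(ecg_10s)  # 48 zero samples on each side (inferred)
```

With the real file, a single exam could presumably be read with h5py as `h5py.File("exams.hdf5", "r")["tracings"][i]`, giving a `(4096, 12)` array. The description does not state that row `i` of `tracings` matches row `i` of `exams.csv`, nor does it list the original length of each exam, so both would need to be checked before trimming the padding.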


The relation between neural-network predicted age and mortality is established in:
"Deep neural network estimated electrocardiographic-age as a mortality predictor"
Emilly M Lima, Antônio H Ribeiro, Gabriela MM Paixão, Manoel Horta Ribeiro, Marcelo M Pinto Filho, Paulo R Gomes, Derick M Oliveira, Ester C Sabino, Bruce B Duncan, Luana Giatti, Sandhi M Barreto, Wagner Meira Jr, Thomas B Schön, Antonio Luiz P Ribeiro. medRxiv (2021). https://www.doi.org/10.1101/2021.02.19.21251232

The companion code can be found in: https://github.com/antonior92/ecg-age-prediction

The SaMi-Trop dataset is described in:

"Longitudinal study of patients with chronic Chagas cardiomyopathy in Brazil (SaMi-Trop project): a cohort profile" Clareci Silva Cardoso, Ester Cerdeira Sabino, Claudia Di Lorenzo Oliveira, Lea Campos de Oliveira, Ariela Mota Ferreira, Edécio Cunha-Neto, Ana Luiza Bierrenbach, João Eduardo Ferreira, Desirée Sant'Ana Haikal, Arthur L Reingold, Antonio Luiz P Ribeiro. BMJ Open (2016);6:e011181. doi: 10.1136/bmjopen-2016-011181

Files (264.6 MB)

  • exams.csv (87.7 kB), md5:6c9007a0427f7c3d9e1b6fb091231a67
  • exams.hdf5 (264.5 MB), md5:a7b65f115b0222ad2ecbc6a422496fdd
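
After downloading, the files can be checked against the MD5 checksums listed above. The sketch below uses only the Python standard library. The helper names `md5_of_file` and `verify_downloads` are hypothetical, and the pairing of each checksum with a file name is inferred from the listed file sizes rather than stated explicitly.

```python
import hashlib
from pathlib import Path

# Checksums as listed on this page. The file-to-checksum pairing is an
# assumption inferred from the listed sizes (87.7 kB and 264.5 MB).
EXPECTED_MD5 = {
    "exams.csv": "6c9007a0427f7c3d9e1b6fb091231a67",
    "exams.hdf5": "a7b65f115b0222ad2ecbc6a422496fdd",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Return {filename: True / False / None}; None means the file is missing."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = (md5_of_file(path) == expected) if path.is_file() else None
    return results


if __name__ == "__main__":
    for name, ok in verify_downloads(".").items():
        status = {True: "OK", False: "MISMATCH", None: "not found"}[ok]
        print(f"{name}: {status}")
```

Reading in chunks keeps memory use low for the larger HDF5 file; a mismatch most likely indicates an incomplete or corrupted download.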