Published September 11, 2023 | Version v1
Dataset Open

Mimicking Clinical Trials with Synthetic Acute Myeloid Leukemia Patients Using Generative Artificial Intelligence

  • 1. Department of Internal Medicine I, University Hospital Carl Gustav Carus, Technical University Dresden, Dresden, Germany
  • 2. Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI) Dresden/Leipzig, Germany
  • 3. Medical Clinic and Policlinic I, Hematology and Cell Therapy, University Hospital, Leipzig, Germany
  • 4. Department of Medicine V, University Hospital Heidelberg, Heidelberg, Germany
  • 5. Department of Medicine 2, Hematology and Oncology, Goethe University Frankfurt, Frankfurt, Germany
  • 6. Department of Hematology and Oncology, University Hospital Schleswig Holstein, Kiel, Germany
  • 7. Department of Medicine A, University Hospital Münster, Münster, Germany
  • 8. Department of Internal Medicine V, Paracelsus Medizinische Privatuniversität and University Hospital Nürnberg, Nürnberg, Germany
  • 9. Department of Hematology, University Hospital Essen, Essen, Germany
  • 10. Department of Hematology, Oncology and Palliative Care, Robert-Bosch-Hospital, Stuttgart, Germany
  • 11. Department of Hematology, Oncology and Immunology, Philipps-University-Marburg, Marburg, Germany
  • 12. Institute for Medical Informatics and Biometry, Technical University Dresden, Dresden, Germany

Description

We used two generative artificial intelligence methods, CTAB-GAN+ and normalizing flows (NFlow), to synthesize patient data based on 1606 patients with acute myeloid leukemia who were treated within four multicenter clinical trials. The resulting dataset consists of 1606 synthetic patients for each of the two models.

This dataset is associated with our publication "Mimicking clinical trials with synthetic acute myeloid leukemia patients using generative artificial intelligence" by Eckardt et al., npj Digital Medicine, 2024 (https://doi.org/10.1038/s41746-024-01076-x). If you use this dataset, please cite our paper.

 

Data Dictionary

NAME | LABEL | TYPE | CODELIST
AGE | age | num | in years
AMLSTAT | AML status | char | de novo, sAML, tAML
ASXL1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
ATRX | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
BCOR | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
BCORL1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
BRAF | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
CALR | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
CBL | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
CBLB | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
CDKN2A | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
CEBPA | CEBPA mutation | char | 0 = 'no mutation', 1 = 'mutation'
CGCX | complex cytogenetic karyotype | char | 0 = 'No', 1 = 'Yes'
CGNK | cytogenetic normal karyotype | char | 0 = 'No', 1 = 'Yes'
CR1 | first complete remission | char | 0 = 'not achieved', 1 = 'achieved'
CSF3R | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
CUX1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
DNMT3A | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
EFSSTAT | status variable for EFSTM | num | 0 = 'censored', 1 = 'event'
EFSTM | event free survival time | num | in months
ETV6 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
EXAML | extramedullary AML | char | 0 = 'No', 1 = 'Yes'
EZH2 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
FBXW7 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
FLT3I | FLT3-ITD mutation status | char | 0 = 'no mutation', 1 = 'mutation'
FLT3T | FLT3-TKD mutation status | char | 0 = 'no mutation', 1 = 'mutation'
GATA2 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
GNAS | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
HB | hemoglobin | num | in mmol/l
HRAS | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
IDH1 | IDH1 mutation status | char | 0 = 'no mutation', 1 = 'mutation'
IDH2 | IDH2 mutation status | char | 0 = 'no mutation', 1 = 'mutation'
IKZF1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
JAK2 | JAK2 mutation | char | 0 = 'no mutation', 1 = 'mutation'
KDM6A | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
KIT | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
KRAS | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
MPL | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
MYD88 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
NOTCH1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
NPM1 | NPM1 mutation status | char | 0 = 'no mutation', 1 = 'mutation'
NRAS | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
OSSTAT | status variable for OSTM | num | 0 = 'censored', 1 = 'event'
OSTM | overall survival time | num | in months
PDGFRA | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
PHF6 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
PLT | platelet count | num | in 10⁶/l
PTEN | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
PTPN11 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
RAD21 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
RUNX1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
SETBP1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
SEX | sex | char | f = 'female', m = 'male'
SF3B1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
SMC1A | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
SMC3 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
SRSF2 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
STAG2 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
SUBJID | subject identifier | char |
TET2 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
TP53 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
U2AF1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
WBC | white blood count | num | in 10⁶/l
WT1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
ZRSR2 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
inv16_t16.16 | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
t8.21 | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
t.6.9..p23.q34. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
inv.3..q21.q26.2. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
minus.5 | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
del.5q. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
t.9.22..q34.q11. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
minus.7 | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
minus.17 | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
t.v.11..v.q23. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
abn.17p. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
t.9.11..p21.23.q23. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
t.3.5. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
t.6.11. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
t.10.11. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
t.11.19..q23.p13. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
del.7q. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
del.9q. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
trisomy 8 | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
trisomy 21 | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
minus.Y | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
minus.X | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
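
As an illustration of how the codelists above might be applied when reading the synthetic data, the following is a minimal sketch. It assumes a comma-separated layout with the column names from the data dictionary. The sample rows, the helper names `normalize_code` and `decode_row`, and the chosen subset of columns are hypothetical and are not taken from the dataset files.

```python
import csv
import io

# Codelists transcribed from the data dictionary (subset only).
BINARY_MUTATION = {"0": "no mutation", "1": "mutation"}
CODELISTS = {
    "NPM1": BINARY_MUTATION,
    "FLT3I": BINARY_MUTATION,
    "TP53": BINARY_MUTATION,
    "CR1": {"0": "not achieved", "1": "achieved"},
    "OSSTAT": {"0": "censored", "1": "event"},
    "SEX": {"f": "female", "m": "male"},
}
NUMERIC = {"AGE", "OSTM"}  # 'num' columns that carry a unit, not a codelist


def normalize_code(value):
    """Map '1', '1.0' or ' 1 ' to '1'; leave codes such as 'f' unchanged."""
    text = value.strip()
    try:
        number = float(text)
    except ValueError:
        return text
    return str(int(number)) if number.is_integer() else text


def decode_row(row):
    """Return a copy of row with coded values replaced by their labels."""
    decoded = {}
    for name, value in row.items():
        if name in CODELISTS:
            # Raises KeyError for a value outside the documented codelist.
            decoded[name] = CODELISTS[name][normalize_code(value)]
        elif name in NUMERIC:
            decoded[name] = float(value)
        else:
            decoded[name] = value
    return decoded


# Hypothetical sample rows; the real files are not reproduced here.
SAMPLE = """SUBJID,AGE,SEX,NPM1,FLT3I,TP53,CR1,OSSTAT,OSTM
S0001,54,f,1,0,0,1,0,36.5
S0002,67,m,0,0,1.0,0,1,4.2
"""

rows = [decode_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
for r in rows:
    print(r["SUBJID"], r["SEX"], r["NPM1"], r["CR1"], r["OSSTAT"], r["OSTM"])
```

Raising on an unknown code rather than passing it through is a deliberate choice in this sketch, so that values outside the documented codelist surface immediately.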

Files

datadictionary.csv

Files (1.2 MB)

Checksum Size
md5:75a4c6e1cdaa6121888d60626240fe45 1.2 MB
md5:5bcc3b5f328fb128d827cb1a395fe825 5.9 kB
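
The MD5 checksums listed above can be used to check that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_checksum` are hypothetical, and the demonstration uses a temporary file, because the record as shown here does not state which checksum belongs to which file name.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or '<hex>'."""
    expected_hex = expected.strip().lower().split(":", 1)[-1]
    return md5_of_file(path) == expected_hex


# Demonstration on a temporary file (not one of the dataset files).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    demo_path = tmp.name

demo_expected = "md5:" + hashlib.md5(b"hello").hexdigest()
demo_ok = matches_checksum(demo_path, demo_expected)
demo_bad = matches_checksum(demo_path, "md5:" + "0" * 32)
os.remove(demo_path)
print("intact:", demo_ok, "| tampered:", demo_bad)
```

For a real download, `matches_checksum` would be called with the local path of the file and the corresponding `md5:` string from the listing.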

Additional details

Related works

Is published in
Publication: 10.1038/s41746-024-01076-x (DOI)