Published September 11, 2023 | Version v1
Dataset
Open
Mimicking Clinical Trials with Synthetic Acute Myeloid Leukemia Patients Using Generative Artificial Intelligence
Creators
- Eckardt, Jan-Niklas (1)
- Hahn, Waldemar (2)
- Röllig, Christoph (1)
- Stasik, Sebastian (1)
- Platzbecker, Uwe (3)
- Müller-Tidow, Carsten (4)
- Serve, Hubert (5)
- Baldus, Claudia D. (6)
- Schliemann, Christoph (7)
- Schäfer-Eckart, Kerstin (8)
- Hanoun, Maher (9)
- Kaufmann, Martin (10)
- Burchert, Andreas (11)
- Thiede, Christian (1)
- Schetelig, Johannes (1)
- Sedlmayr, Martin (12)
- Bornhäuser, Martin (1)
- Wolfien, Markus (12)
- Middeke, Jan Moritz (1)
- 1. Department of Internal Medicine I, University Hospital Carl Gustav Carus, Technical University Dresden, Dresden, Germany
- 2. Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI) Dresden/Leipzig, Germany
- 3. Medical Clinic and Policlinic I, Hematology and Cell Therapy, University Hospital, Leipzig, Germany
- 4. Department of Medicine V, University Hospital Heidelberg, Heidelberg, Germany
- 5. Department of Medicine 2, Hematology and Oncology, Goethe University Frankfurt, Frankfurt, Germany
- 6. Department of Hematology and Oncology, University Hospital Schleswig Holstein, Kiel, Germany
- 7. Department of Medicine A, University Hospital Münster, Münster, Germany
- 8. Department of Internal Medicine V, Paracelsus Medizinische Privatuniversität and University Hospital Nürnberg, Nürnberg, Germany
- 9. Department of Hematology, University Hospital Essen, Essen, Germany
- 10. Department of Hematology, Oncology and Palliative Care, Robert-Bosch-Hospital, Stuttgart, Germany
- 11. Department of Hematology, Oncology and Immunology, Philipps-University-Marburg, Marburg, Germany
- 12. Institute for Medical Informatics and Biometry, Technical University Dresden, Dresden, Germany
Description
We used two generative artificial intelligence methods, CTAB-GAN+ and normalizing flows (NFlow), to synthesize patient data based on 1606 patients with acute myeloid leukemia who were treated within four multicenter clinical trials. The resulting dataset contains 1606 synthetic patients for each of the two models.
This dataset is associated with our publication "Mimicking clinical trials with synthetic acute myeloid leukemia patients using generative artificial intelligence" by Eckardt et al., npj Digital Medicine, 2024 (https://doi.org/10.1038/s41746-024-01076-x). If you use this dataset, please cite our paper.
Data Dictionary
NAME | LABEL | TYPE | CODELIST |
---|---|---|---|
AGE | age | num | in years |
AMLSTAT | AML status | char | de novo, sAML, tAML |
ASXL1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
ATRX | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
BCOR | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
BCORL1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
BRAF | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
CALR | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
CBL | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
CBLB | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
CDKN2A | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
CEBPA | CEBPA mutation | char | 0 = 'no mutation', 1 = 'mutation' |
CGCX | complex cytogenetic karyotype | char | 0 = 'No', 1 = 'Yes' |
CGNK | cytogenetic normal karyotype | char | 0 = 'No', 1 = 'Yes' |
CR1 | first complete remission | char | 0 = 'not achieved', 1 = 'achieved' |
CSF3R | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
CUX1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
DNMT3A | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
EFSSTAT | status variable for EFSTM | num | 0 = 'censored', 1 = 'event' |
EFSTM | event free survival time | num | in months |
ETV6 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
EXAML | extramedullary AML | char | 0 = 'No', 1 = 'Yes' |
EZH2 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
FBXW7 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
FLT3I | FLT3-ITD mutation status | char | 0 = 'no mutation', 1 = 'mutation' |
FLT3T | FLT3-TKD mutation status | char | 0 = 'no mutation', 1 = 'mutation' |
GATA2 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
GNAS | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
HB | hemoglobin | num | in mmol/l |
HRAS | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
IDH1 | IDH1 mutation status | char | 0 = 'no mutation', 1 = 'mutation' |
IDH2 | IDH2 mutation status | char | 0 = 'no mutation', 1 = 'mutation' |
IKZF1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
JAK2 | JAK2 mutation | char | 0 = 'no mutation', 1 = 'mutation' |
KDM6A | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
KIT | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
KRAS | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
MPL | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
MYD88 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
NOTCH1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
NPM1 | NPM1 mutation status | char | 0 = 'no mutation', 1 = 'mutation' |
NRAS | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
OSSTAT | status variable for OSTM | num | 0 = 'censored', 1 = 'event' |
OSTM | overall survival time | num | in months |
PDGFRA | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
PHF6 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
PLT | platelet count | num | in 10⁶/l |
PTEN | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
PTPN11 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
RAD21 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
RUNX1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
SETBP1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
SEX | sex | char | f = 'female', m = 'male' |
SF3B1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
SMC1A | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
SMC3 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
SRSF2 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
STAG2 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
SUBJID | subject identifier | char | |
TET2 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
TP53 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
U2AF1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
WBC | white blood cell count | num | in 10⁶/l |
WT1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
ZRSR2 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation' |
inv16_t16.16 | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation' |
t8.21 | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation' |
t.6.9..p23.q34. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation' |
inv.3..q21.q26.2. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation' |
minus.5 | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation' |
del.5q. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation' |
t.9.22..q34.q11. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation' |
minus.7 | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation' |
minus.17 | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation' |
t.v.11..v.q23. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation' |
abn.17p. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation' |
t.9.11..p21.23.q23. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation' |
t.3.5. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation' |
t.6.11. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation' |
t.10.11. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation' |
t.11.19..q23.p13. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation' |
del.7q. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation' |
del.9q. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation' |
trisomy 8 | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation' |
trisomy 21 | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation' |
minus.Y | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation' |
minus.X | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation' |
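
Each survival time in the dictionary is paired with a status variable (OSTM with OSSTAT, EFSTM with EFSSTAT), where 1 marks an observed event and 0 a censored observation. As a minimal sketch of how such a pair is typically consumed, the following Python snippet computes a Kaplan-Meier estimate. The function name `kaplan_meier` and the toy values are hypothetical and not taken from the dataset, and this is not necessarily how the associated publication carried out its survival analysis.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  -- follow-up times, e.g. OSTM in months
    events -- status codes, e.g. OSSTAT: 1 = event, 0 = censored
    """
    # Number of events observed at each time point.
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    # Every subject leaves the risk set at their own time,
    # whether by event or by censoring.
    leaving = Counter(times)

    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Subjects censored at t still count as at risk at t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical toy values for illustration only, NOT rows from the dataset.
ostm = [2.0, 5.0, 5.0, 8.0, 12.0]   # overall survival time in months
osstat = [1, 0, 1, 0, 1]            # 1 = event, 0 = censored

curve = kaplan_meier(ostm, osstat)
for time, prob in curve:
    print(f"{time:5.1f} months: S(t) = {prob:.2f}")
```

The same function applies unchanged to EFSTM and EFSSTAT. For real analyses, an established survival analysis library is likely the better choice, since it also provides confidence intervals and group comparisons.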
Files
datadictionary.csv
Size | Checksum |
---|---|
1.2 MB | md5:75a4c6e1cdaa6121888d60626240fe45 |
5.9 kB | md5:5bcc3b5f328fb128d827cb1a395fe825 |
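
To decode the coded columns, the CODELIST entries of the data dictionary can be turned into lookup tables. The sketch below assumes the entries follow the notation shown in the Data Dictionary table above (a code, an optional `=`, and a label in single quotes). The helper name `parse_codelist` is hypothetical, and the exact layout of datadictionary.csv (delimiter, quoting) has not been verified here.

```python
import re

# A code token, an optional "=", then a label in single quotes.
_PAIR = re.compile(r"(\w+)\s*=?\s*'([^']*)'")


def parse_codelist(text):
    """Map each code in a CODELIST string to its label.

    Codes are kept as strings because several coded columns are
    declared as type 'char' in the data dictionary.
    Entries without quoted labels (e.g. units) yield an empty dict.
    """
    return {code: label for code, label in _PAIR.findall(text)}


# CODELIST strings in the notation of the data dictionary.
mutation_codes = parse_codelist("0 = 'no mutation', 1 = 'mutation'")
status_codes = parse_codelist("0 'censored' 1 'event'")
sex_codes = parse_codelist("f 'female', m 'male'")
unit_only = parse_codelist("in years")

print(mutation_codes)
print(status_codes)
print(sex_codes)
print(unit_only)
```

Keeping the codes as strings avoids guessing a numeric type for columns such as SEX; convert the keys to integers where the corresponding column is read as numeric.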
Additional details
Related works
- Is published in
- Publication: 10.1038/s41746-024-01076-x (DOI)