BoneMarrowWSI-PediatricLeukemia: A Comprehensive Dataset of Bone Marrow Aspirate Smear Whole Slide Images with Expert Annotations and Clinical Data in Pediatric Leukemia
Creators
- 1. Fraunhofer Institute for Digital Medicine MEVIS, Bremen, Germany
- 2. Universitätsklinikum Erlangen
- 3. Friedrich-Alexander-Universität Erlangen-Nürnberg
- 4. Medical Informatics, Friedrich-Alexander University of Erlangen-Nürnberg, Erlangen, Germany
- 5. PixelMed Publishing
- 6. Brigham and Women's Hospital Department of Radiology
- 7. Fraunhofer Institute for Digital Medicine
- 8. Department of Pediatrics and Adolescent Medicine, University Hospital Erlangen, Erlangen, Germany
Description
The dataset comprises bone marrow aspirate smear WSIs for 257 pediatric cases of leukemia, including acute lymphoid leukemia (ALL), acute myeloid leukemia (AML), and chronic myeloid leukemia (CML). The smears were prepared for the initial diagnosis (i.e., without prior treatment), stained in accordance with the Pappenheim method, and scanned at 40x magnification.
The images have been annotated with rectangular regions of interest (ROIs) within the evaluable monolayer area, and a total of 47176 cell bounding box annotations have been placed within these regions. Cells were annotated by multiple experts in a consensus labeling approach using 49 distinct cell type classes. In this approach, each cell was annotated sequentially by different individuals until it had been labeled by at least two individuals and the majority class accounted for at least half of all annotations for that image. The labels from all annotation sessions, as well as the final consensus class for each cell, are made available.
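The stopping rule of the consensus approach can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `consensus_label` is hypothetical, it assumes one label per annotator per cell, and it treats a tie for the most frequent class as unresolved, which is an assumption the description above does not state.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus_label(labels: Sequence[str], min_annotators: int = 2) -> Optional[str]:
    """Return the consensus class for one cell, or None if more labels are needed.

    `labels` holds the class assigned in each annotation session so far
    (assumed: one label per annotator).
    """
    # The description requires at least two individuals per cell.
    if len(labels) < min_annotators:
        return None

    ranked = Counter(labels).most_common()
    top_label, top_count = ranked[0]

    # Assumption (not stated in the description): a tie for first place
    # is treated as unresolved, so another annotation would be requested.
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return None

    # The majority class must account for at least half of all annotations.
    if 2 * top_count >= len(labels):
        return top_label
    return None


# Two agreeing annotators are enough; a disagreement needs a third opinion.
agreed = consensus_label(["blast", "blast"])
undecided = consensus_label(["blast", "lymphocyte"])
resolved = consensus_label(["blast", "lymphocyte", "blast"])
```

Under these assumptions, a cell on which the first two annotators disagree stays open until a further session produces a unique most frequent class.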
Additionally, clinical information (age group, sex, diagnosis) and laboratory data (blasts, white blood cell count, thrombocytes, LDH, uric acid, hemoglobin) are available for each case.
Files included
Pending peer review of the accompanying manuscript, this dataset currently contains only a first sample of the dataset described above: 2 bone marrow aspirate smear whole slide images (WSIs) with their cell annotations.
The entire dataset will be available in the National Cancer Institute Imaging Data Commons (https://imaging.datacommons.cancer.gov). If you have any questions about the dataset, please contact IDC support at support@canceridc.dev.
Both images and annotations are in DICOM format. All DICOM objects relating to the same smear are contained in the same folder. Clinical data are contained in the DICOM metadata.
In addition, lab_values_sample.csv contains the collected lab values for those two smears.
The attached files are named according to the following convention, based on the corresponding DICOM tags: %PatientID-%Modality-%SeriesDescription-%SOPInstanceUID.dcm
For example, the files for patient A6BBC91AE73DD21C0533F735470A9CD0 include the following 6 DICOM Slide Microscopy (SM) modality files, each representing one level of the WSI pyramid:
A6BBC91AE73DD21C0533F735470A9CD0-SM-Bone marrow aspirate smear, May-Gruenwald-Giemsa stain-1.2.826.0.1.3680043.8.498.26060080718466278522952527845683544045.dcm
A6BBC91AE73DD21C0533F735470A9CD0-SM-Bone marrow aspirate smear, May-Gruenwald-Giemsa stain-1.2.826.0.1.3680043.8.498.62030007770863397357636084490828160953.dcm
A6BBC91AE73DD21C0533F735470A9CD0-SM-Bone marrow aspirate smear, May-Gruenwald-Giemsa stain-1.2.826.0.1.3680043.8.498.7859053050060184362011899525686475413.dcm
A6BBC91AE73DD21C0533F735470A9CD0-SM-Bone marrow aspirate smear, May-Gruenwald-Giemsa stain-1.2.826.0.1.3680043.8.498.85089919641169806925347867181900526802.dcm
A6BBC91AE73DD21C0533F735470A9CD0-SM-Bone marrow aspirate smear, May-Gruenwald-Giemsa stain-1.2.826.0.1.3680043.8.498.88236263312726593722497600529137414206.dcm
A6BBC91AE73DD21C0533F735470A9CD0-SM-Bone marrow aspirate smear, May-Gruenwald-Giemsa stain-1.2.826.0.1.3680043.8.498.95738688685525699076567938918194597802.dcm
Cell annotations with labels from each annotation session in the labeling process are stored in DICOM Bulk Annotations (ANN modality) objects:
A6BBC91AE73DD21C0533F735470A9CD0-ANN-Cell bounding boxes with cell type labels; annotation session: 0-1.2.826.0.1.3680043.10.511.3.12557519480564734942303269163896694.dcm
A6BBC91AE73DD21C0533F735470A9CD0-ANN-Cell bounding boxes with cell type labels; annotation session: 1-1.2.826.0.1.3680043.10.511.3.6987603211883801558525207593845155.dcm
A6BBC91AE73DD21C0533F735470A9CD0-ANN-Cell bounding boxes with cell type labels; annotation session: 2-1.2.826.0.1.3680043.10.511.3.1120336965786739278582883135803528.dcm
A6BBC91AE73DD21C0533F735470A9CD0-ANN-Cell bounding boxes with cell type labels; annotation session: 3-1.2.826.0.1.3680043.10.511.3.6408695988615311222439105226576101.dcm
A6BBC91AE73DD21C0533F735470A9CD0-ANN-Cell bounding boxes with cell type labels; annotation session: 4-1.2.826.0.1.3680043.10.511.3.52627683252818668316930590707706798.dcm
A6BBC91AE73DD21C0533F735470A9CD0-ANN-Cell bounding boxes with consensus cell type labels-1.2.826.0.1.3680043.10.511.3.1666264618985716614248499039136585.dcm
A6BBC91AE73DD21C0533F735470A9CD0-ANN-Monolayer regions of interest for cell classification-1.2.826.0.1.3680043.10.511.3.6350792333250462425535421489809492.dcm
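The naming convention above can be split back into its four components with a short Python sketch. The function name `parse_dataset_filename` and the regular expression are illustrative assumptions, not part of the dataset. The pattern assumes that PatientID and Modality contain no hyphens and that the SOPInstanceUID consists only of digits and dots, which holds for the file names listed above. The SeriesDescription may itself contain hyphens (as in "May-Gruenwald-Giemsa"), so the UID is anchored at the end of the name.

```python
import re
from typing import Dict, Optional

# %PatientID-%Modality-%SeriesDescription-%SOPInstanceUID.dcm
# Assumptions: no hyphen in PatientID or Modality; UID is digits and dots only.
FILENAME_PATTERN = re.compile(
    r"^(?P<patient_id>[^-]+)-"
    r"(?P<modality>[^-]+)-"
    r"(?P<series_description>.+)-"
    r"(?P<sop_instance_uid>[0-9.]+)\.dcm$"
)


def parse_dataset_filename(filename: str) -> Optional[Dict[str, str]]:
    """Split a file name into its DICOM-derived parts, or return None if it does not match."""
    match = FILENAME_PATTERN.match(filename)
    return match.groupdict() if match else None


example = parse_dataset_filename(
    "A6BBC91AE73DD21C0533F735470A9CD0-ANN-Cell bounding boxes with cell type labels; "
    "annotation session: 0-1.2.826.0.1.3680043.10.511.3.12557519480564734942303269163896694.dcm"
)
# example["modality"] is "ANN"; example["series_description"] ends with "annotation session: 0"
```

Filtering on the `modality` field then separates the image pyramid levels (SM) from the annotation objects (ANN) of a given patient.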
Acknowledgments
The authors thank Stefanie Barnickel, Nathalie Dollmann, Tatjana Flamann, Meinolf Suttorp, and Perdita Weller for the labelling of the cells.
The authors thank the following institutions for supplying BMA smears: University Hospital Augsburg (Univ.-Prof. Dr. Dr. med. Michael Frühwald), Charité Berlin - ALL-REZ BFM Study Group (PD Dr. med. Arend von Stackelberg), University Hospital at the TU Dresden (Prof. Dr. med. Meinolf Suttorp), University Hospital Essen - AML-BFM Study Group (Prof. Dr. Dirk Reinhardt), Technical University of Munich (Prof. Dr. med. Irene Teichert-von Lüttichau), University Hospital Würzburg (Prof. Dr. med. Matthias Eyrich).
This study was supported by a grant from the German Federal Ministry of Education and Research (FKZ: 031L0262A; BMDeep).
Preparation of the dataset for publication was partly supported by Federal funds from the National Cancer Institute, National Institutes of Health (Task Order No. HHSN26110071 under Contract HHSN261201500003l).
Files
(13.3 GB)
MD5 checksum | Size |
---|---|
c8ec67289ca84bdb7f64eff6a654b930 | 44.9 kB |
3fd7e7feda6649287660d6fb8601669b | 44.2 kB |
24839b95634043990786f4ea885575eb | 20.6 kB |
f944e08b18048559147f9dcb15719e69 | 15.6 kB |
ac577457a191bd56bb6afcd8ff27dd6e | 11.9 kB |
e985304acee8ca1940a29d71172f6d91 | 44.9 kB |
c58d855c9892a311fb37f9bc6c5862d2 | 5.8 kB |
c02d73ef74e3ce798b0db431bc5dfcd9 | 169.8 kB |
39bab31eae69435945506c87bb99dd3b | 5.4 GB |
3f002f98201a3472f01cc82be39eb736 | 1.8 MB |
59b893d27212522ad335c486007a68f5 | 321.1 MB |
bf1a34e4a1e5a04f3f0569d647a3f139 | 29.7 MB |
a2b9e77b4efd576043ae7592030caf4d | 84.6 kB |
e02c72888c202aeec09db5bf69f11729 | 174.1 kB |
ce33b380d65d24142d173e6650fa18db | 35.8 kB |
6b042475ce193034c0784615488f2680 | 38.1 kB |
a3f3ed991d78442057e519916a271e98 | 18.1 kB |
99a5ca8153e35aa7b303b0e53c3a57f5 | 15.6 kB |
037d2fb0cc299fb9af9c9b8ea9c1c8fc | 14.8 kB |
778cf0fee312880f4239041532b7841e | 36.6 kB |
f185d896ec0161c1dea8baf73cabe0b2 | 5.8 kB |
be8e7583c302c0047190cc699db03a60 | 112.8 kB |
2d1378eaa618b5bb99cb48eac301f69d | 260.9 kB |
56508f02f2772145f0c13d961b8c597b | 276.2 kB |
fdccb52878e6b3a5698d215816c49381 | 6.8 GB |
4cb7280d25d29626229e09355ae567d3 | 4.1 MB |
90d0b071797b009e68d31239a7305694 | 621.5 MB |
30a9e9632b4bcadfd4886e2cba143a67 | 60.6 MB |
d78afa2fe9b7914e2dbf9d03deae1752 | 351 Bytes |
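The MD5 checksums in the file listing above can be used to check that a download completed without corruption. The following is a minimal sketch using only the Python standard library. The helper names `md5_of_file` and `verify_download` are hypothetical, and the file is read in chunks because the largest files in this record are several gigabytes in size.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: str, expected_md5: str) -> bool:
    """Compare a file's MD5 digest with a listed checksum (an optional 'md5:' prefix is ignored)."""
    expected = expected_md5.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `verify_download(local_path, "md5:d78afa2fe9b7914e2dbf9d03deae1752")` would check the 351-byte file in the listing, where `local_path` stands for wherever that file was saved.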