U-10: United-10 COVID19 CT Dataset
Description
This dataset supports the research detailed in the pre-print "Virtual Imaging Trials Improved the Transparency and Reliability of AI Systems in COVID-19 Imaging." The study employs both clinical and simulated CT data to evaluate AI models for COVID-19 diagnosis. By leveraging the Virtual Imaging Trials (VIT) framework, the research addresses reproducibility and generalizability issues prevalent in medical imaging AI models.
The dataset includes:
- Clinical CT Data: Drawn from 10 publicly available datasets, comprising over 12,000 volumes. These datasets span diverse populations, imaging protocols, and scanner configurations. Each of the 10 zip files contains pre-processed CT TFRecords (Train/Validation/Test) used in the study. Detailed information about the data sources, pre-processing steps, and the inclusion and exclusion criteria can be found in the manuscript (https://arxiv.org/abs/2308.09730).
- Simulated CT Data: Generated using computational anatomical phantoms from the XCAT model and imaged with the DukeSim simulation framework. This synthetic dataset allows controlled experiments that isolate the effects of imaging physics and patient-specific factors. The simulated data are available upon request through the Center for Virtual Imaging Trials portal at https://cvit.duke.edu/
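The TFRecord files described above can be sanity-checked after unzipping without knowing their feature schema. The following is a minimal sketch, assuming the files are uncompressed TFRecords (the page does not say whether compression was used) and relying only on the standard TFRecord framing: an 8-byte little-endian length, a 4-byte CRC of the length, the payload, and a 4-byte CRC of the payload. The function names are illustrative and are not part of the study's code; CRCs are skipped rather than verified.

```python
import struct


def iter_tfrecord_payloads(path):
    """Yield the raw payload bytes of each record in an uncompressed TFRecord file.

    Assumption: the file is not GZIP/ZLIB compressed. CRC fields are read
    but not checked, so this detects truncation, not bit-level corruption.
    """
    with open(path, "rb") as f:
        while True:
            header = f.read(12)  # 8-byte length + 4-byte CRC of the length
            if not header:
                return  # clean end of file
            if len(header) < 12:
                raise ValueError("truncated record header")
            (length,) = struct.unpack("<Q", header[:8])
            payload = f.read(length)
            footer = f.read(4)  # 4-byte CRC of the payload
            if len(payload) < length or len(footer) < 4:
                raise ValueError("truncated record body")
            yield payload


def summarize_tfrecord(path):
    """Return (number_of_records, total_payload_bytes) for one file."""
    count = 0
    total = 0
    for payload in iter_tfrecord_payloads(path):
        count += 1
        total += len(payload)
    return count, total
```

Calling `summarize_tfrecord` on each extracted Train/Validation/Test file gives a quick record count to compare against the volume counts reported in the manuscript. Decoding the payloads themselves requires the feature schema, which is defined in the project's source code rather than on this page.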
The accompanying study analyzes the performance of lightweight convolutional neural networks on both real and synthetic data, comparing results across multiple internal and external validation scenarios. It also explores how factors such as infection severity, imaging dose, and modality type affect model performance.
For further details, visit our Project Page: https://fitushar.github.io/ReviCOVID.github.io/
The full source code is available on GitHub and GitLab:
GitHub: https://github.com/fitushar/CVIT_ReviCOVID19
GitLab: https://gitlab.oit.duke.edu/cvit-public/cvit_revicovid19
Citation: When using this dataset, please cite the manuscript (arXiv:2308.09730) and the original data sources.
Tushar et al., "Virtual Imaging Trials Improved the Transparency and Reliability of AI Systems in COVID-19 Imaging", arXiv:2308.09730.
Contact: fakrulislam.tushar@duke.edu
Files
bimcv_tfrecords_96x160x160.zip
Total size: 52.5 GB
MD5 checksum | Size |
---|---|
md5:ebae4041aa60c40e17a598219993199b | 24.6 GB |
md5:bee0247f988d8580988c35c4f6f66099 | 3.7 GB |
md5:a47895d83d0afa3d12a3f8f759a48759 | 2.7 GB |
md5:f93126c36fb812492d272e2793c1b9e0 | 2.3 GB |
md5:d46441f00125a00e02038ad56434ae8f | 1.8 GB |
md5:0c916cfc90806f429e7d590f55ab3ccb | 2.9 GB |
md5:78a9850527e764804e61990f7427b66f | 3.9 GB |
md5:6fecabea209b277a490c0d7ce900c9eb | 1.0 GB |
md5:8a6f95d474c365b2892189736688efce | 4.5 GB |
md5:349f0e4998182b11b82641d6ab9b4ab0 | 5.0 GB |
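Because the archives are large, it may be worth checking each download against the md5 value listed for it above before unzipping. The following is a minimal sketch using Python's standard `hashlib`; the helper names and the example file path are hypothetical, and the file is read in chunks so that a multi-gigabyte archive does not need to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a listed checksum such as 'md5:ebae...'.

    The optional 'md5:' prefix used on this page is stripped before comparing.
    """
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_md5("path/to/downloaded.zip", "md5:ebae4041aa60c40e17a598219993199b")` returns `True` only if the local file (a placeholder path here) is byte-for-byte identical to the archive that carries that checksum in the list.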
Additional details
Software
- Repository URL: https://gitlab.oit.duke.edu/cvit-public/cvit_revicovid19
References
- Tushar, Fakrul Islam, Lavsen Dahal, Saman Sotoudeh-Paima, Ehsan Abadi, W. Paul Segars, Ehsan Samei, and Joseph Y. Lo. "Data diversity and virtual imaging in AI-based diagnosis: A case study based on COVID-19." arXiv preprint arXiv:2308.09730 (2023).
- Tushar, Fakrul Islam, Ehsan Abadi, Saman Sotoudeh-Paima, Rafael B. Fricks, Maciej A. Mazurowski, W. Paul Segars, Ehsan Samei, and Joseph Y. Lo. "Virtual vs. reality: external validation of COVID-19 classifiers using XCAT phantoms for chest computed tomography." Proc. SPIE 12033, Medical Imaging 2022: Computer-Aided Diagnosis, 1203305 (4 April 2022). https://doi.org/10.1117/12.2613010