Published November 11, 2024 | Version v1
Image Open

U-10: United-10 COVID19 CT Dataset

  • 1. Duke University

Description

This dataset supports the research detailed in the pre-print "Virtual Imaging Trials Improved the Transparency and Reliability of AI Systems in COVID-19 Imaging." The study employs both clinical and simulated CT data to evaluate AI models for COVID-19 diagnosis. By leveraging the Virtual Imaging Trials (VIT) framework, the research addresses reproducibility and generalizability issues prevalent in medical imaging AI models.

The dataset includes:

  • Clinical CT Data: Drawn from 10 publicly available datasets, comprising over 12,000 volumes. These datasets span diverse populations, imaging protocols, and scanner configurations. Each of the 10 zip files contains the pre-processed CT TFRecords (Train/Validation/Test) used in the study. Detailed information about the data sources, pre-processing steps, and the inclusion and exclusion criteria can be found in the manuscript (https://arxiv.org/abs/2308.09730).
  • Simulated CT Data: Generated using computational anatomical phantoms from the XCAT model and imaged with the DukeSim simulation framework. This synthetic dataset allows controlled experiments that isolate the effects of imaging physics and patient-specific factors. The simulated data are available upon request through the Center for Virtual Imaging Trials (CVIT) portal at https://cvit.duke.edu/
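
As a quick sanity check after extracting one of the clinical archives, the number of records in a TFRecord file can be counted without TensorFlow by walking the standard TFRecord framing. The sketch below is illustrative only: it assumes the files are uncompressed TFRecords, it skips the CRC fields rather than verifying them, and it does not decode the record contents (the feature schema is not described on this page; see the source code repositories). The function name is hypothetical.

```python
import os
import struct


def count_tfrecord_records(path):
    """Count records in an uncompressed TFRecord file.

    Each record is framed as: uint64 payload length, uint32 CRC of the length,
    the payload bytes, and a uint32 CRC of the payload (all little-endian).
    The CRCs are skipped here, not verified.
    """
    size = os.path.getsize(path)
    offset = 0
    count = 0
    with open(path, "rb") as handle:
        while offset < size:
            header = handle.read(12)  # 8-byte length + 4-byte length CRC
            if len(header) < 12:
                raise ValueError(f"truncated record header at byte {offset}")
            (length,) = struct.unpack("<Q", header[:8])
            offset += 12 + length + 4  # header + payload + payload CRC
            if offset > size:
                raise ValueError("truncated record payload at end of file")
            handle.seek(offset)  # jump over the payload without reading it
            count += 1
    return count
```

Comparing the counts for the Train, Validation, and Test files against the split sizes reported in the manuscript is one way to confirm that an archive was extracted completely.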

The accompanying study analyzes the performance of lightweight convolutional neural networks on both real and synthetic data, comparing results across multiple internal and external validation scenarios. It also explores the influence of factors such as infection severity, imaging dose, and modality type.

For further details, visit our Project Page: https://fitushar.github.io/ReviCOVID.github.io/
The full source code is available on GitHub and GitLab:
GitHub: https://github.com/fitushar/CVIT_ReviCOVID19
GitLab: https://gitlab.oit.duke.edu/cvit-public/cvit_revicovid19

Citation: When using this dataset, please cite the manuscript (arXiv:2308.09730) and the original data sources.
Tushar et al., "Virtual Imaging Trials Improved the Transparency and Reliability of AI Systems in COVID-19 Imaging", arXiv:2308.09730.

Contact: fakrulislam.tushar@duke.edu

Files

bimcv_tfrecords_96x160x160.zip

Files (52.5 GB)

Checksum                              Size
md5:ebae4041aa60c40e17a598219993199b  24.6 GB
md5:bee0247f988d8580988c35c4f6f66099  3.7 GB
md5:a47895d83d0afa3d12a3f8f759a48759  2.7 GB
md5:f93126c36fb812492d272e2793c1b9e0  2.3 GB
md5:d46441f00125a00e02038ad56434ae8f  1.8 GB
md5:0c916cfc90806f429e7d590f55ab3ccb  2.9 GB
md5:78a9850527e764804e61990f7427b66f  3.9 GB
md5:6fecabea209b277a490c0d7ce900c9eb  1.0 GB
md5:8a6f95d474c365b2892189736688efce  4.5 GB
md5:349f0e4998182b11b82641d6ab9b4ab0  5.0 GB
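
The MD5 checksums listed above can be used to confirm that a large archive downloaded intact. The following is a minimal sketch using only the Python standard library; the function names are illustrative and not part of the dataset's tooling.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that multi-gigabyte archives do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected):
    """Compare a file's MD5 with a checksum written as 'md5:<hex>' or '<hex>'."""
    expected_hex = expected.strip().lower().split(":", 1)[-1]
    return md5_of_file(path) == expected_hex
```

For example, `verify_download("path/to/archive.zip", "md5:<checksum from the listing>")` returns `True` only when the local file matches the published checksum; both arguments here are placeholders to be filled in by the user.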

Additional details

References

  • Tushar, Fakrul Islam, Lavsen Dahal, Saman Sotoudeh-Paima, Ehsan Abadi, W. Paul Segars, Ehsan Samei, and Joseph Y. Lo. "Data diversity and virtual imaging in AI-based diagnosis: A case study based on COVID-19." arXiv preprint arXiv:2308.09730 (2023).
  • Fakrul Islam Tushar, Ehsan Abadi, Saman Sotoudeh-Paima, Rafael B. Fricks, Maciej A. Mazurowski, W. Paul Segars, Ehsan Samei, Joseph Y. Lo, "Virtual vs. reality: external validation of COVID-19 classifiers using XCAT phantoms for chest computed tomography," Proc. SPIE 12033, Medical Imaging 2022: Computer-Aided Diagnosis, 1203305 (4 April 2022); https://doi.org/10.1117/12.2613010