CryoVirusDB: An Annotated Dataset for AI-Based Virus Particle Identification in Cryo-EM Micrographs
Description
With advances in instrumentation, image processing algorithms, and computational capabilities, single-particle cryo-electron microscopy (cryo-EM) has achieved atomic resolution in determining the 3D structures of viruses. These structures play a crucial role in studying viral biological function and in advancing the development of antiviral vaccines and treatments. Despite the effectiveness of artificial intelligence (AI) in general image processing, its development for identifying and extracting virus particles from cryo-EM micrographs has been hindered by the lack of high-quality, manually labeled datasets. To fill this gap, we introduce CryoVirusDB, a labeled dataset containing the coordinates of expert-picked virus particles in cryo-EM micrographs. CryoVirusDB comprises 9,941 micrographs from 9 datasets representing 7 distinct nonenveloped viruses exhibiting icosahedral or pseudoicosahedral symmetry, along with the coordinates of 339,398 labeled virus particles. It can be used to train and test AI and machine learning (e.g., deep learning) methods that accurately identify virus particles in cryo-EM micrographs, a prerequisite for building atomic 3D structural models of viruses.
Instructions to download and use the dataset are openly available at: https://github.com/BioinfoMachineLearning/CryoVirusDB
Files
Total size: 22.4 MB

| MD5 checksum | Size |
|---|---|
| 8d435cbb0186ecd25d68572bc5564a37 | 4.4 MB |
| f0562804393fd8217a424c3a4f865418 | 932.6 kB |
| 1f26854f52e63318afd0515bcdeb3116 | 479.6 kB |
| dde50f8369183005c4ae7e9a6be4f367 | 2.0 MB |
| ca7dd2353d559ad3d5511fd6e3f66a37 | 4.3 MB |
| f40df4cd97d028a6738fcddf24f4fc23 | 2.7 MB |
| 1fac55d66458f930057ec5bfb9d512b6 | 4.6 MB |
| 3021ce98e7dc2087366adcc795cb216a | 1.7 MB |
| db8947fb784be6e196ff1c18d26b74ac | 1.4 MB |
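
Each file in this record is listed with an MD5 checksum, which can be used to confirm that a download completed without corruption. The following is a minimal sketch of such a check using only the Python standard library. The helper names `md5_of_file` and `verify_download` are illustrative and not part of CryoVirusDB, and the file name in the usage note is hypothetical because the file names are not shown in this listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Compare a file's MD5 digest with a listed checksum.

    Accepts the checksum with or without the "md5:" prefix used in the
    record listing, in either upper or lower case.
    """
    expected = expected_md5.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `verify_download("downloaded_file.zip", "md5:8d435cbb0186ecd25d68572bc5564a37")` would return `True` only if the local file (hypothetical name) matches the first checksum in the table. MD5 is adequate here for detecting accidental corruption, though it is not a guarantee against deliberate tampering.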
Additional details
Funding
- National Institutes of Health
- Cryo-EM R01GM146340