Viro3D Dataset – Part 1: Metadata, ColabFold Predictions and Foldseek database
Description
This repository contains Part 1 of the dataset associated with the manuscript “Viro3D: a comprehensive database of virus protein structure predictions” by Ulad Litvin, Spyros Lytras, Alexander Jack, David L. Robertson, Joseph Hughes, and Joe Grove.
The dataset includes:
- Metadata: protein and species lists in CSV format (viro3d_metadata.tar.gz)
- Relaxed ColabFold predictions in PDB format (colabfold_pdb.tar.gz)
- ColabFold pLDDT and pTM confidence scores in JSON format (colabfold_json_scores.tar.gz)
- ColabFold multiple sequence alignments in A3M format (colabfold_msa.tar.gz)
- Viro3D Foldseek structural search database (foldseekViro3D.tar.gz)
- Foldseek-derived structural protein clusters (viro3d_protein_clusters.tar.gz)
- Foldseek-derived structural similarity network (viro3d_protein_network.tar.gz)
- Foldseek-based annotation expansion of protein functions (viro3d_annotation_expansion.tar.gz)
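The archives listed above are gzip-compressed tarballs, so it can be useful to peek at their contents before unpacking everything. The following is a minimal sketch using Python's standard `tarfile` module; the helper name `list_archive_members` and the `limit` parameter are illustrative, and the internal directory layout of each archive is not documented here, so inspect the output rather than assuming particular paths.

```python
import tarfile


def list_archive_members(archive_path, limit=10):
    """Return the names of up to `limit` regular files in a .tar.gz archive.

    The archive is read as a stream and nothing is extracted to disk.
    """
    names = []
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive:
            # Skip directories, links and other non-file entries.
            if member.isfile():
                names.append(member.name)
                if len(names) >= limit:
                    break
    return names


# Example (assumes the archive has already been downloaded to the
# current directory):
# print(list_archive_members("viro3d_metadata.tar.gz"))
```

Stopping after a few members avoids decompressing a multi-gigabyte archive just to learn how it is organised.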
Files (47.9 GB)
| MD5 checksum | Size |
|---|---|
| 7450435f3682ddb8890359f6071fd080 | 31.6 GB |
| df16642fc9f2233f73fd76e41f9f6c48 | 7.8 GB |
| f4b2e1a992cbea448d5aa75caced5646 | 7.9 GB |
| ed46740700f169e5cfadd38573c9ff56 | 568.7 MB |
| 01f6407c563b345bd438ed10db16fb41 | 12.3 MB |
| 70e8e3b7fddd37170e14522574c72239 | 18.5 MB |
| d49fed63243e3140a6859131d3080aaf | 3.9 MB |
| f43d0b8f63b6fce2057d30960ded14ff | 5.9 MB |
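Because the largest files run to tens of gigabytes, it is worth checking each download against its MD5 checksum before unpacking. The sketch below uses Python's standard `hashlib` module and reads the file in chunks so that it never has to fit in memory. The function names `md5_of_file` and `verify_md5` are illustrative, and the pairing of a given archive with a given checksum is not shown in the listing above, so take it from the record page for the file you downloaded.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected):
    """Return True if the file's MD5 matches `expected`.

    `expected` may be given with or without a leading "md5:" prefix,
    and is compared case-insensitively.
    """
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


# Example (hypothetical local path; substitute the checksum published
# for the archive you actually downloaded):
# ok = verify_md5("downloaded_archive.tar.gz", "md5:<checksum from the record page>")
```

A mismatch usually indicates an interrupted or corrupted transfer, in which case the file should be downloaded again.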