Published March 4, 2023 | Version 1.0
Dataset
Open
Additional data and code for "You can move, but you can't hide: identification of mobile genetic elements with geNomad"
Creators
- 1. DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
Description
- benchmark_data: Data used to train and evaluate the classification models.
- giant_virus_data: Sequences and metadata of giant viruses identified in public metagenomes.
- neural_network_training: Code used to train geNomad's neural network-based classification model.
- provirus_data: Data used to train and evaluate the conditional random field model employed by geNomad to identify provirus regions.
- reference_sequences: Sequences of chromosomes, plasmids, and viruses that were used to build geNomad's marker dataset and to generate the training data for the classification models.
Files
Name | Size | Checksum |
---|---|---|
genomad_supplementary_data_code.zip | 8.6 GB | md5:25ea62a5b626bdfd6790a36cb3b310b7 |
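Because the archive is large (8.6 GB), it can be worth checking a downloaded copy against the MD5 checksum listed above before unpacking it. The following is a minimal sketch of such a check, not part of the deposited code: the helper names `md5_of_file` and `verify_download` are hypothetical, and the local path assumes the archive was saved under its original file name in the current directory.

```python
import hashlib
from pathlib import Path

# Checksum as listed in this record (without the "md5:" prefix).
EXPECTED_MD5 = "25ea62a5b626bdfd6790a36cb3b310b7"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that a multi-gigabyte archive never has to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local path: the archive saved under its original name.
    archive = Path("genomad_supplementary_data_code.zip")
    if not archive.exists():
        print(f"{archive} not found")
    elif verify_download(archive):
        print("checksum OK")
    else:
        print("checksum mismatch: the download may be incomplete or corrupted")
```

A mismatch usually indicates a truncated or corrupted transfer, in which case downloading the archive again is the simplest remedy.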