Training data for 'Genome annotation with Funannotate' tutorial (Galaxy Training Material)
- 1. INRAE
- 2. Ifremer
- 3. CNRS
Description
The data provided here are part of a Galaxy Training Network tutorial for genome annotation with funannotate.
The genome was assembled following the GTN Flye assembly tutorial, then masked with RepeatMasker.
RNA-seq data: reads from SRR8534859 were mapped to the genome using STAR (toolshed.g2.bx.psu.edu/repos/iuc/rgrnastar/rna_star/2.7.8a+galaxy0), then the BAM file was downsampled (15%, with toolshed.g2.bx.psu.edu/repos/devteam/picard/picard_DownsampleSam/2.18.2.1) to reduce the size of the dataset. FASTQ files were then extracted from the resulting BAM file (toolshed.g2.bx.psu.edu/repos/devteam/picard/picard_SamToFastq/2.18.2.1).
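The downsampling step can be illustrated with a minimal sketch. It assumes that "15%" is the fraction of reads retained and that both mates of a pair share a read name. The functions `keep_read` and `downsample` are hypothetical helpers written for this illustration; they do not reproduce Picard's actual implementation. They only show the general idea of deciding per read name, so that paired mates are kept or dropped together.

```python
import hashlib


def keep_read(read_name, fraction, seed="1"):
    """Deterministically keep roughly `fraction` of all read names.

    Hypothetical helper: hashes the read name so that both mates of a
    pair (same name) always receive the same decision.
    """
    digest = hashlib.md5(f"{seed}:{read_name}".encode()).digest()
    # Interpret the first 8 bytes as an integer in [0, 2**64).
    value = int.from_bytes(digest[:8], "big")
    return value < fraction * 2**64


def downsample(records, fraction, seed="1"):
    """Yield the (name, payload) records whose name passes keep_read."""
    for name, payload in records:
        if keep_read(name, fraction, seed):
            yield name, payload
```

For example, `list(downsample(records, 0.15))` would retain about 15% of the read names in `records`, with each pair kept or dropped as a unit.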
SwissProt_subset.fasta is a subset of SwissProt proteins known to have some similarity with the genome (found by running Diamond against the genome, then extracting the sequences that matched with an e-value < 0.0001).
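The e-value filtering described above could be sketched as follows. This assumes the Diamond hits were written in the 12-column BLAST tabular format, with the protein identifier in the second column and the e-value in the eleventh. The exact Diamond command and output layout used to build the subset are not stated here, so the column indices are assumptions, and `hit_ids` and `filter_fasta` are hypothetical helper names.

```python
def hit_ids(tabular_lines, max_evalue=1e-4, id_column=1, evalue_column=10):
    """Collect identifiers of hits whose e-value is below max_evalue.

    Column indices are 0-based and assume BLAST-style tabular output.
    """
    ids = set()
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if float(fields[evalue_column]) < max_evalue:
            ids.add(fields[id_column])
    return ids


def filter_fasta(fasta_lines, wanted_ids):
    """Yield the lines of FASTA records whose identifier is in wanted_ids."""
    keep = False
    for line in fasta_lines:
        if line.startswith(">"):
            # The identifier is the first whitespace-delimited token after '>'.
            tokens = line[1:].split()
            keep = bool(tokens) and tokens[0] in wanted_ids
        if keep:
            yield line
```

Reading the hits first and streaming the FASTA file second keeps memory use proportional to the number of matching identifiers rather than to the size of the protein database.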
Files
(649.8 MB)
MD5 checksum | Size |
---|---|
7e9e0dbb83eea353253aca07faf45f15 | 49.8 MB |
fade08bc61483f040403363f542d6e6d | 270.4 MB |
006fa10efbed947287d9c02227431296 | 326.8 MB |
12d46b3ad9b1b2b5c73c14f6b19b4a9c | 2.9 MB |
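After downloading, each file can be checked against its listed MD5 checksum. The following is a minimal sketch using only the Python standard library. `md5_of_file` and `matches_checksum` are hypothetical helper names, and the local file path is whatever name the file was saved under.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a file's MD5 with an expected value such as 'md5:7e9e...'."""
    expected = expected_md5.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

A call such as `matches_checksum(local_path, "md5:12d46b3ad9b1b2b5c73c14f6b19b4a9c")` returns `True` only if the downloaded file is byte-for-byte identical to the published one.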