Dataset Open Access
Anthony Bretaudeau;
Alexandre Cormier;
Stéphanie Robin;
Erwan Corre;
Laura Leroi
The data provided here are part of a Galaxy Training Network tutorial for genome annotation with funannotate.
The genome was assembled following the GTN Flye assembly tutorial, then masked with RepeatMasker.
RNA-seq data: reads from SRR8534859 were mapped to the genome using STAR (toolshed.g2.bx.psu.edu/repos/iuc/rgrnastar/rna_star/2.7.8a+galaxy0), then the resulting BAM file was downsampled to 10% (with toolshed.g2.bx.psu.edu/repos/devteam/picard/picard_DownsampleSam/2.18.2.1) to reduce the size of the dataset. FASTQ files were then extracted from the downsampled BAM file (toolshed.g2.bx.psu.edu/repos/devteam/picard/picard_SamToFastq/2.18.2.1).
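To illustrate how a paired-end dataset can be downsampled to roughly 10% without separating mates, here is a minimal Python sketch. It is not Picard's implementation: the function name `keep_read`, the SHA-256 hash, and the `seed` parameter are assumptions chosen for illustration. The idea is that the keep/drop decision depends only on the read name, which both mates of a pair share.

```python
import hashlib


def keep_read(read_name, fraction=0.10, seed=1):
    """Decide deterministically whether to keep the pair named read_name.

    Hypothetical helper: the decision depends only on the read name and
    the seed, so both mates of a pair always get the same answer.
    """
    digest = hashlib.sha256(f"{seed}:{read_name}".encode()).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    value = int.from_bytes(digest[:8], "big") / 2**64
    return value < fraction


if __name__ == "__main__":
    names = [f"read{i}" for i in range(10000)]
    kept = sum(keep_read(name) for name in names)
    print(f"kept {kept} of {len(names)} read pairs")
```

Changing `seed` selects a different subset of about the same size, while keeping the result reproducible for a given seed.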
SwissProt_subset.fasta is a subset of SwissProt proteins with detectable similarity to the genome: matches were searched with Diamond against the genome, and the sequences matching with e-value < 0.0001 were extracted.
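The e-value filtering step described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the procedure actually used to build the file: it assumes the Diamond hits were saved in the default 12-column BLAST tabular format (e-value in the 11th column) and that the protein ID is in the second column, which depends on which side was the query. The function names and file paths are hypothetical.

```python
import csv


def ids_below_evalue(tabular_path, max_evalue=1e-4, id_column=1, evalue_column=10):
    """Collect sequence IDs from hits whose e-value is strictly below max_evalue.

    Column indices are 0-based and assume BLAST tabular output:
    qseqid, sseqid, pident, length, mismatch, gapopen,
    qstart, qend, sstart, send, evalue, bitscore.
    """
    keep = set()
    with open(tabular_path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if not row or row[0].startswith("#"):
                continue  # skip blank lines and comments
            if float(row[evalue_column]) < max_evalue:
                keep.add(row[id_column])
    return keep


def filter_fasta(fasta_in, fasta_out, keep_ids):
    """Copy records whose ID (first word of the header) is in keep_ids.

    Returns the number of records written.
    """
    written = 0
    keeping = False
    with open(fasta_in) as src, open(fasta_out, "w") as dst:
        for line in src:
            if line.startswith(">"):
                header = line[1:].split()
                keeping = bool(header) and header[0] in keep_ids
                written += keeping
            if keeping:
                dst.write(line)
    return written
```

A call such as `filter_fasta("swissprot.fasta", "subset.fasta", ids_below_evalue("hits.tsv"))` would then write only the matching proteins; all three file names here are placeholders.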
Name | MD5 | Size |
---|---|---|
genome_masked.fasta | e28b3275a1a45057b87d193c1df6168b | 49.6 MB |
rnaseq_R1.fq.gz | ca50ac884a00ccb7553287b9d601ecd9 | 184.0 MB |
rnaseq_R2.fq.gz | df5de61de301484850e2244f06a459d1 | 221.4 MB |
SwissProt_subset.fasta | 12d46b3ad9b1b2b5c73c14f6b19b4a9c | 2.9 MB |
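The MD5 checksums listed for the files can be used to confirm that downloads completed intact. The following is a minimal Python sketch using only the standard library; the function names `md5_of_file` and `check_downloads` are hypothetical, and it assumes the files were saved under their listed names in a single directory.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing of this record.
EXPECTED_MD5 = {
    "genome_masked.fasta": "e28b3275a1a45057b87d193c1df6168b",
    "rnaseq_R1.fq.gz": "ca50ac884a00ccb7553287b9d601ecd9",
    "rnaseq_R2.fq.gz": "df5de61de301484850e2244f06a459d1",
    "SwissProt_subset.fasta": "12d46b3ad9b1b2b5c73c14f6b19b4a9c",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_downloads(directory="."):
    """Return {filename: status}, where status is 'ok', 'mismatch' or 'missing'."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of_file(path) == expected:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results


if __name__ == "__main__":
    for name, status in check_downloads().items():
        print(f"{name}: {status}")
```

Reading in chunks keeps memory use low, which matters for the larger read files.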