Published April 28, 2021 | Version v1
Dataset
Open
Stramenopile dataset for positive selection
Description
This repository contains the data (sequences, annotations, and intermediary files) collected and produced during the preparation of the following preprint: https://doi.org/10.1101/2021.01.12.426341. Description of files:
- The samples.zip file contains, for each taxon:
  - The functional annotations from InterProScan, with the extension ".tsv", in TSV format.
  - The genome annotations, with the extension ".gff", in GFF3 format.
  - The protein sequences, with the extension ".faa", in FASTA format.
  - The corresponding coding DNA sequences, with the extension ".fna", in FASTA format.
- The all_ann.csv file contains all annotations of the tested genes, with added information on positive selection and orthology status, in CSV format.
- The go_mapping.csv file contains the mapping of GO terms to the protein accessions of the dataset, in CSV format.
- The protein_families.poff.tsv file contains the Proteinortho output, which classifies the genes of the dataset into ortholog groups, in TSV format.
- The families.zip file contains the intermediary files for each of the selected orthogroups:
  - Tree files in Newick format in the folder "trees".
  - Protein sequences in the folder "faas".
  - Coding DNA sequences in the folder "fnas".
  - Log outputs from the FUBAR analysis in the folder "logs".
  - Codon alignments in the folder "codon_alns".
- The families_fubar.zip file contains the same kinds of files as families.zip, restricted to the subset of orthogroups with a positive result in the FUBAR analysis, plus the log output from the aBSREL analysis in the log folder.
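As an illustration of how the paired ".faa" and ".fna" files might be read together, the following minimal Python sketch parses FASTA text and checks that a coding sequence has the length expected for its protein. It assumes that a record identifier is the first whitespace-delimited token of the header line and that protein and CDS records share identifiers; neither is stated in this record, and the function names `read_fasta` and `cds_matches_protein` are hypothetical.

```python
from typing import Dict, Iterable, List


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {record_id: sequence} mapping.

    Assumption: the record id is the first whitespace-delimited token
    of each header line (the text following ">").
    """
    records: Dict[str, List[str]] = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            tokens = line[1:].split()
            current = tokens[0] if tokens else ""
            records[current] = []
        elif current is not None:
            records[current].append(line)  # sequence may span several lines
    return {name: "".join(parts) for name, parts in records.items()}


def cds_matches_protein(cds: str, protein: str) -> bool:
    """Return True if the CDS length is 3x the protein length.

    A trailing "*" in the protein is ignored, and one extra codon
    (a stop codon) is tolerated at the end of the CDS.
    """
    n = len(protein.rstrip("*"))
    return len(cds) in (3 * n, 3 * n + 3)


# In-memory example; real input would be an open file handle instead.
proteins = read_fasta([">gene1 example protein", "MKT", "L*"])
coding = read_fasta([">gene1", "ATGAAAACT", "CTGTAA"])
consistent = cds_matches_protein(coding["gene1"], proteins["gene1"])
```

Because `read_fasta` accepts any iterable of lines, an open text file can be passed directly, for example a ".faa" file extracted from samples.zip. A length check like this is only a quick sanity test, not a substitute for translating the sequence.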
Files
Total size: 21.1 GB

Checksum | Size |
---|---|
md5:68d82b41788622e8c978e1f2737a0966 | 18.5 GB |
md5:2d7afbe1d40fcecd7aa0c2ac5e0e2f21 | 529.8 MB |
md5:3f37f12a8b947561a378040aaeac0b40 | 57.2 MB |
md5:fe2f9c20a2edd02bd3453148b7ff4940 | 37.8 MB |
md5:cbdebbf18ded0cbd57d540605c14e9b2 | 2.0 GB |
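Since the listing gives an MD5 checksum for each file, a download can be checked against it. The sketch below is one possible way to do this in Python with the standard `hashlib` module; the helper names `md5_of_stream` and `matches_listed_checksum` are hypothetical, and the expected value has to be copied from the listing entry of the file being checked.

```python
import hashlib
import io
from typing import BinaryIO


def md5_of_stream(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Compute the hex MD5 digest of a binary stream, reading in chunks
    so that multi-gigabyte archives need not fit in memory."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(stream: BinaryIO, listed: str) -> bool:
    """Compare a stream's MD5 digest with a listed value.

    The listed value may carry the "md5:" prefix used in the file table.
    """
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_stream(stream) == expected


# In-memory demonstration; for a real download, open the file with
# mode "rb" and pass the handle instead of the BytesIO object.
demo_digest = md5_of_stream(io.BytesIO(b"hello"))
```

MD5 is adequate here for detecting corrupted or truncated downloads; it is not meant as a security guarantee.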
Additional details
Related works
- Is cited by
- Preprint: 10.1101/2021.01.12.426341v3 (DOI)