Published September 21, 2021 | Version v1
Dataset Open

Gene family data from the PhyloGenes (release version 3.2, phylogenes.org)

Description

The data files were generated from the PhyloGenes 3.2 release (see the release notes).

About the two zip files: 

1. phyloXML.zip (there is no change from the PhyloGenes 3.0 release)

PhyloGenes gene family trees in PhyloXML format, one file per family (e.g. <family_ID>.xml).

The following information is provided for each node of a tree:

1) leaf node:
   branch length
   name <gene_id>
   taxonomy scientific_name
   sequence accession <UniProt ID>

2) non-leaf node:
   branch length
   events <duplication or speciation>
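
As an illustration of how the per-node fields above might be read, here is a minimal Python sketch using only the standard library. It assumes the files follow the standard PhyloXML schema (namespace http://www.phyloxml.org, with branch_length, name, taxonomy/scientific_name, sequence/accession and events as child elements of each clade); the exact element layout in the PhyloGenes files has not been checked here. The sample tree, gene names and accessions are invented placeholders, and describe_nodes is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumption: files use the standard PhyloXML namespace.
PHY = "http://www.phyloxml.org"
NS = {"phy": PHY}

# Invented sample data, shaped like the description above (not a real family).
SAMPLE = """<phyloxml xmlns="http://www.phyloxml.org">
  <phylogeny rooted="true">
    <clade>
      <branch_length>0.0</branch_length>
      <events><speciations>1</speciations></events>
      <clade>
        <name>GENE_A</name>
        <branch_length>0.12</branch_length>
        <taxonomy><scientific_name>Arabidopsis thaliana</scientific_name></taxonomy>
        <sequence><accession>P00001</accession></sequence>
      </clade>
      <clade>
        <name>GENE_B</name>
        <branch_length>0.30</branch_length>
        <taxonomy><scientific_name>Oryza sativa</scientific_name></taxonomy>
        <sequence><accession>P00002</accession></sequence>
      </clade>
    </clade>
  </phylogeny>
</phyloxml>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def describe_nodes(root):
    """Return one dict per clade, in document order (root clade first)."""
    rows = []
    for clade in root.iter("{%s}clade" % PHY):
        # A clade with no child clade is a leaf.
        is_leaf = clade.find("phy:clade", NS) is None
        events = clade.find("phy:events", NS)
        rows.append({
            "leaf": is_leaf,
            "branch_length": clade.findtext("phy:branch_length", namespaces=NS),
            "name": clade.findtext("phy:name", namespaces=NS),
            "organism": clade.findtext("phy:taxonomy/phy:scientific_name", namespaces=NS),
            "accession": clade.findtext("phy:sequence/phy:accession", namespaces=NS),
            # Names of whatever child elements <events> holds, e.g. 'speciations'.
            "events": [local_name(e.tag) for e in events] if events is not None else [],
        })
    return rows


rows = describe_nodes(ET.fromstring(SAMPLE))
leaves = [r for r in rows if r["leaf"]]
```

For a real file, ET.parse("<family_ID>.xml").getroot() would take the place of ET.fromstring(SAMPLE).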


2. CSV.zip

Functional information of family members in CSV format, one file per family (e.g. <family_ID>.csv). 

A CSV file includes the following columns:
Uniprot ID
Gene <gene name, or the gene ID if there is no gene name>
Gene ID
Gene name
Organism
Subfamily name

Any columns after 'Subfamily name' are GO annotations. Each such column is a GO molecular function or biological process term that is annotated to at least one member of the gene family, with the annotation supported by experimental evidence (indicated by 'EXP') or phylogenetic inference (indicated by 'IBA'). A '0' indicates that the member has neither kind of annotation for that term.
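
The column layout described above could be read along these lines. This is a minimal Python sketch, assuming the header row contains the text 'Subfamily name' exactly as written here, that the first column holds the UniProt ID, and that every cell in a GO column is 'EXP', 'IBA' or '0'. The sample rows, GO term names and accessions are invented placeholders, and annotations_by_member is a hypothetical helper name.

```python
import csv
import io

# Invented sample data with two GO term columns after 'Subfamily name'.
SAMPLE_CSV = """Uniprot ID,Gene,Gene ID,Gene name,Organism,Subfamily name,kinase activity,response to stress
P00001,GENE_A,ID_A,GENE_A,Arabidopsis thaliana,SUBFAMILY 1,EXP,0
P00002,ID_B,ID_B,,Oryza sativa,SUBFAMILY 1,IBA,IBA
"""


def annotations_by_member(handle):
    """Map each member (first column) to {GO term: evidence code}.

    Cells equal to '0' are treated as 'no annotation' and skipped.
    """
    reader = csv.reader(handle)
    header = next(reader)
    # Every column after 'Subfamily name' is taken to be a GO term.
    first_go = header.index("Subfamily name") + 1
    go_terms = header[first_go:]
    result = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        cells = row[first_go:]
        result[row[0]] = {
            term: value.strip()
            for term, value in zip(go_terms, cells)
            if value.strip() != "0"
        }
    return result


result = annotations_by_member(io.StringIO(SAMPLE_CSV))
```

For a real file, the handle would come from open("<family_ID>.csv", newline="") instead of io.StringIO. A family with no GO columns simply yields an empty dict per member.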

Files

phylogenes_csv_3_2.zip

Files (103.0 MB)

Name Size
md5:cd96676ebcc9874e1f18a2dd396cb8c0 33.5 MB
md5:dccffccc1478cca9947a4b124f3eeae4 69.5 MB