Published January 19, 2023 | Version v1
Dataset Open

Vertebrate gene family trees

  • 1. University of Glasgow

Description

This dataset contains phylogenies for 118 vertebrate gene families, used in two papers published by James Cotton and Roderic Page in 2002. The archive "genes.zip" contains FASTA and NEXUS sequence files and Newick-format tree files for each gene family. The PDF file "suppl.pdf" lists the name of each family. The file "final_dataset.gml" contains a graph of the taxonomic overlap among the gene trees. All 118 gene trees are combined in the file "final_dataset.gtr", a NEXUS file with custom blocks recognised by GeneTree.
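As a rough sketch of how the contents of "genes.zip" might be inspected, the Python below groups archive members by file extension and pulls leaf labels out of a Newick string. The internal layout of the archive and the file extensions it uses are not documented in this record, so the function names `members_by_extension` and `newick_leaf_labels` and the example tree are hypothetical. The leaf extractor handles only simple, unquoted labels.

```python
import re
import zipfile
from collections import defaultdict


def members_by_extension(archive):
    """Group the file names in a zip archive by lower-cased extension.

    `archive` may be a path or a binary file-like object.
    """
    groups = defaultdict(list)
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.endswith("/"):  # skip directory entries
                continue
            base = name.rsplit("/", 1)[-1]
            ext = base.rsplit(".", 1)[-1].lower() if "." in base else ""
            groups[ext].append(name)
    return dict(groups)


def newick_leaf_labels(newick):
    """Return leaf labels of a simple Newick string (unquoted labels only)."""
    # A leaf label directly follows "(" or "," and ends at ":", ",", ")" or ";".
    # Internal node labels follow ")" and are therefore not captured.
    return re.findall(r"[(,]\s*([^\s(),:;]+)", newick)


# Hypothetical three-taxon tree, for illustration only.
example_tree = "((human:0.1,mouse:0.2):0.05,chicken:0.3);"
example_leaves = newick_leaf_labels(example_tree)
```

For real analyses a dedicated phylogenetics library is a safer choice than a regular expression, since Newick files may contain quoted labels and comments that this sketch ignores.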

HOVERGEN FAMILY CODE GENE FAMILY NAME
FAM000030A wnt 5
FAM000030B wnt 7
FAM000030C wnt 11
FAM000030D wnt/int 1
FAM000030E wnt 4
FAM000030F wnt 10/12
FAM000030G wnt 3
FAM000030H wnt 2
FAM000030I wnt 8
FAM000105 rhodopsin
FAM000214 beta-B globin
FAM000033 protamine 1
FAM000370 PrP a prion-protein
FAM000014 growth hormone
FAM000556 Rag-1 recombination activation gene
FAM001493 c-mos proto-oncogene
FAM001462 tyrosine kinase / yes / fyn / src / lck
FAM001041 metallothionein
FAM000364 Ldh-2 lactate dehydrogenase-B (EC
FAM000215 alpha globin
FAM000016 placental lactogen - prolactin
FAM000008 insulin
FAM000824 phosphoglycerate kinase
FAM000550 neurotrophin-4 (NT-4)
FAM001478 tyrosine kinase receptor, c-fms oncogene
FAM000173A guanine nucleotide-binding protein
FAM000173B transducin alpha
FAM000502 cytochrome P-450 aromatase
FAM000192 alpha-fetoprotein / serum albumin
FAM000627 neurone-specific enolase
FAM001232 preprotrypsin (ta)
FAM000664 complement component 3 (C3)
FAM000175 Ras
FAM000248 alpha B-crystallin
FAM000006 insulin-like growth factor II
FAM000639 transthyretin (prealbumin)
FAM001303 butyrylcholinesterase (BCHE)
FAM000242 connexin / gap junction protein
FAM000058 dopamine D1 receptor
FAM000055 beta-3-adrenergic receptor
FAM000131 ATPase (Na+K+, H+K+)
FAM003983 preprogastrin
FAM000353 vasopressin
FAM000152 acetylcholine receptor
FAM001327 peripherin, desmin, vimentin, GFAP
FAM003199 (C57BL/6J)
FAM002881 cytochrome P-450, 17a-hydroxylase (CYP17)
FAM002789 (MUAHRB-1) Ah-receptor (Ah)
FAM001461 tropomyosin
FAM000385 Y3 peptide supply factor
FAM000378 pancreatic polypeptide, neuropeptide Y
FAM001329 cytokeratin
FAM001619 amelogenin (enamel-specific protein)
FAM000286 glucagon
FAM000330 tissue inhibitor of
FAM000475 lipophilin
FAM001060 ornithine carbamoyltransferase
FAM001328 neurofilament
FAM001370 peroxisome proliferator
FAM000371 somatostatin
FAM001607 Wilms tumor associated protein (WT1)
FAM000495 aldolase A, B, C
FAM001365 liver receptor homologous protein (LRH-1)
FAM000672 ribosomal protein S4,
FAM000135 Na, K-ATPase beta-1 subunit
FAM000271 enkephalin : 1 2.
FAM000799 RING10
FAM001664 glutamate decarboxylase
FAM000617 creatine kinase
FAM001337 amyloid beta protein precursor
FAM000274 basic fibroblast growth factor (bFGF)
FAM001108 Sl-d mutant allele kit ligand (KL)
FAM000504 anion exchange protein 3
FAM001239 prothrombin
FAM001360 high mobility group proteins HMG1 and HMG2
FAM000534 glutamine synthetase
FAM001053 nucleoside diphosphate kinase
FAM001390 low density lipoprotein receptor LDLR
FAM001464 Cek6 receptor tyrosine kinase
FAM000801 manganese-containing superoxide
FAM003946 Six2 / Six1 mRNA
FAM001606 Ikaros binding protein (Ikaros)
FAM000350 SPARC protein
FAM001339 calcium-binding protein
FAM000553 pyruvate kinase
FAM000604 t complex polypeptide 1 (Tcp-1-a)
FAM002988 mSlo
FAM001266 transcription factor / hepatocyte nuclear factor
FAM000453 terminal deoxynucleotidyltransferase
FAM001642 transformation associated protein p53
FAM000300A glucose-regulated protein 78 / HSP70 PART A
FAM000300B glucose-regulated protein 78 / HSP70 PART B
FAM000492A alpha actin etc
FAM000492B beta actin etc
FAM000649 Myelin Basic Protein
FAM000843 ribosomal protein S4
FAM000170 atrial natriuretic protein
FAM000871A tyrosinase
FAM000871B tyrosinase related protein 1
FAM001605 ZFX put. transcription activator
FAM001479 fibroblast growth factor
FAM000160 pro-opiomelanocortin (POMC)
FAM001644 fibrinogen alpha subunit
FAM000564 SNAP-25
FAM000266 nitric oxide synthase
FAM004159 chondroitin-6 sulfotransferase
FAM000800 Lmp-2 (LMPq) proteasome subunit
FAM001595 factor B
FAM001134 zona pellucida (ZP)
FAM001632 stromelysin-3
FAM001366 steroid receptor (TR2-9)
FAM000526 c-ski protein
FAM002463 thymosin beta 4 peptide
FAM000904 sequence-specific DNA-binding protein (AP-2)
FAM001465 T-cell specific tyrosine kinase (ltk)
FAM000567 triosephosphate isomerase
FAM006113 DNA-dependent RNA polymerase III, large subunit
FAM001733 DNA-dependent RNA polymerase II
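
The listing above can be turned into a lookup table. The following is a minimal Python sketch, assuming each row is a family code followed by whitespace and the family name. It also assumes that a trailing capital letter on a code (as in FAM000030A to FAM000030I) marks subdivisions of one HOVERGEN family; that reading of the suffix is an assumption, not something this record states. The function name `parse_family_rows` is hypothetical.

```python
import re
from collections import defaultdict

# "FAM" + six digits, an optional capital-letter suffix, then the family name.
ROW = re.compile(r"^(FAM\d{6})([A-Z]?)\s+(.*)$")


def parse_family_rows(lines):
    """Return (code -> name, base code -> list of full codes)."""
    names = {}
    groups = defaultdict(list)
    for line in lines:
        match = ROW.match(line.strip())
        if not match:  # header row or blank line
            continue
        base, suffix, name = match.groups()
        code = base + suffix
        names[code] = name
        groups[base].append(code)
    return names, dict(groups)


# A few rows copied from the listing above.
rows = [
    "HOVERGEN FAMILY CODE GENE FAMILY NAME",
    "FAM000030A wnt 5",
    "FAM000030B wnt 7",
    "FAM000105 rhodopsin",
]
names, groups = parse_family_rows(rows)
```

Grouping by base code makes it easy to see which entries were split into several trees and which stand alone.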

Files

final_dataset_connections.pdf

Files (1.6 MB)

Checksum Size
md5:a620d858a53a9d3da6bf4be5fcf0dfdb 24.1 kB
md5:29f30b3b644643746aa97f1284571f92 128.3 kB
md5:143a4df97a2f5b578a0f29930aa2e742 82.8 kB
md5:b503d50a600d70c68369218ff718490c 1.3 MB
md5:dad25bf4dc4f90142cefdee726f4dcdd 41.5 kB
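
A downloaded file can be checked against the MD5 values listed above. The sketch below uses Python's standard `hashlib` module; the helper names `md5_hex` and `matches_listed_md5` are hypothetical. This listing does not say which checksum belongs to which file name, so the caller has to supply that pairing.

```python
import hashlib


def md5_hex(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a listed value such as 'md5:a620d8...'."""
    # Accept the value with or without the "md5:" prefix.
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_hex(path) == expected
```

MD5 is adequate here for detecting a corrupted or truncated download; it is not a defence against deliberate tampering.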

Additional details

Related works

Is supplement to
Journal article: 10.1098/rspb.2002.2074 (DOI)
Conference paper: 10.1142/9789812799623_0050 (DOI)
