Published June 17, 2014 | Version v2
Dataset Open

A New Model and Dating for the Evolution of Complex Plastids of Red Alga Origin

  • 1. Faculty of Biotechnology, University of Wroclaw

Description

The zip files include alignments of protein sequences in FASTA format (SequenceAlignments.zip) and the concatenated sets used in our research project (ConcatenatedAlignments.zip). Additionally, we included raw (unaligned) protein sequences in FASTA format (RawSequences.zip). In total, we employed 97 amino acid sequences of conserved plastid-encoded proteins, carefully selected from the NCBI Reference Sequence database (https://www.ncbi.nlm.nih.gov/refseq/) and GenBank (https://www.ncbi.nlm.nih.gov/genbank/), representing 112 organisms. Our dataset included 111 eukaryotes carrying red alga-derived plastids and the closest cyanobacterial relative of plastids, Gloeomargarita lithophora Alchichica D10. We aligned each homologous protein group independently using the slow but accurate L-INS-i algorithm implemented in MAFFT v7.429 (https://doi.org/10.1093/molbev/mst010). The resulting multiple sequence alignments were carefully assessed using AliView (http://dx.doi.org/10.1093/bioinformatics/btu531), and phylogenetically informative sites were selected with trimAl (https://doi.org/10.1093/bioinformatics/btp348) and ClipKIT (https://doi.org/10.1371/journal.pbio.3001007). The trimmed alignments were concatenated into supermatrices using SequenceMatrix 1.8 (https://doi.org/10.1111/j.1096-0031.2010.00329.x) to generate comprehensive datasets for phylogenetic and molecular clock analyses. We also generated a supermatrix composed of untrimmed alignments.
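For readers who want to see what concatenating per-protein alignments into a supermatrix involves, the following is a minimal Python sketch. It is not SequenceMatrix and does not reproduce its behaviour. It assumes that taxon labels are identical across the per-protein FASTA files and that a taxon missing from one alignment is padded with gap characters. The function names `parse_fasta` and `concatenate` and the toy sequences are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping taxon label -> sequence.

    Assumption: the first whitespace-delimited token of each header
    is the taxon label shared across all alignments.
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {taxon: "".join(parts) for taxon, parts in records.items()}


def concatenate(alignments, missing="-"):
    """Join aligned protein sets side by side into one supermatrix.

    alignments: list of dicts (taxon -> aligned sequence).
    Returns (supermatrix, partitions), where partitions holds the
    1-based (start, end) columns of each input alignment.
    """
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    columns = {taxon: [] for taxon in taxa}
    partitions = []
    start = 1
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("all sequences in an alignment must have equal length")
        length = lengths.pop()
        for taxon in taxa:
            # Pad taxa that lack this protein (illustrative choice of "-").
            columns[taxon].append(aln.get(taxon, missing * length))
        partitions.append((start, start + length - 1))
        start += length
    supermatrix = {taxon: "".join(parts) for taxon, parts in columns.items()}
    return supermatrix, partitions


# Toy example with made-up sequences for two proteins and three taxa.
protein_a = parse_fasta(">taxon1\nMK-L\n>taxon2\nMKAL\n")
protein_b = parse_fasta(">taxon1\nGG\n>taxon3\nGA\n")
supermatrix, partitions = concatenate([protein_a, protein_b])
```

The partition coordinates returned alongside the supermatrix are the kind of information that partitioned phylogenetic analyses usually need, which is why the sketch tracks them.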

Files

ConcatenatedAlignments.zip

Files (10.1 MB)

Checksum Size
md5:1cb7c3f39be065f7c2439134585d1172 8.4 MB
md5:d17e7fe6d4ff2d53c511d7914b8f88fe 133.9 kB
md5:d64758f98ba87269ce8d406f8fc8cd3d 782.3 kB
md5:e47423d3d260e5c7305ad003804f7b5d 825.4 kB
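
The md5 values listed above can be used to check that a downloaded archive is intact. The following is a minimal Python sketch using only the standard library. The helper names `md5_of_file` and `matches_listed_md5` are hypothetical, and because the listing does not show which checksum belongs to which archive, the path passed in is for the reader to supply.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or plain hex."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

A call such as `matches_listed_md5(path_to_archive, "md5:1cb7c3f39be065f7c2439134585d1172")` returns `True` only if the local file's digest equals the listed value.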