Published October 12, 2012 | Version v1
Dataset Open

Data from: Serine codon-usage bias in deep phylogenomics: pancrustacean relationships as a case study

Description

Phylogenomic analyses of ancient relationships are usually performed using amino acid data, but it is unclear whether amino acids or nucleotides should be preferred. With the twofold aim of addressing this problem and clarifying pancrustacean relationships, we explored the signals in the 62 protein-coding genes carefully assembled by Regier et al. in 2010. With reference to the pancrustaceans, this data set infers a highly supported nucleotide tree that is substantially different from the corresponding, but poorly supported, amino acid tree. We show that the discrepancy between the nucleotide-based and the amino acid-based trees is caused by substitutions within synonymous codon families (especially the two serine families, TCN and AGY). We show that different arthropod lineages are differentially biased in their usage of serine, arginine, and leucine synonymous codons, and that the serine bias is correlated with the topology derived from the nucleotides, but not with the one derived from the amino acids. We suggest that a parallel, partially compositionally driven, synonymous codon-usage bias affects the nucleotide topology. As substitutions between serine codon families can proceed through threonine or cysteine intermediates, amino acid data sets might also be affected by the serine codon-usage bias. We suggest that a Dayhoff recoding strategy would partially ameliorate the effects of such bias. Although amino acids provide an alternative hypothesis of pancrustacean relationships, neither the nucleotide nor the amino acid version of this data set seems to carry enough genuine phylogenetic information to robustly resolve the relationships within the group, which should still be considered unresolved.
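
The two ideas at the centre of the description, the split of serine codons into the TCN and AGY families and Dayhoff recoding of amino acids, can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names (`serine_family_counts`, `dayhoff_recode`) are hypothetical, the six-class grouping is the commonly used Dayhoff-6 scheme (assumed here, since the description does not list the classes), and the input is assumed to be an in-frame coding sequence.

```python
# Illustrative sketch only; names and the class scheme are assumptions,
# not taken from the dataset or the associated paper.

# Commonly used six-class Dayhoff grouping (assumed).
DAYHOFF_CLASSES = {
    "AGPST": "1",
    "DENQ": "2",
    "HKR": "3",
    "ILMV": "4",
    "FWY": "5",
    "C": "6",
}
AA_TO_CLASS = {aa: label for group, label in DAYHOFF_CLASSES.items() for aa in group}


def serine_family_counts(cds):
    """Count serine codons per family (TCN vs AGY) in an in-frame coding sequence."""
    seq = cds.upper().replace("U", "T")
    counts = {"TCN": 0, "AGY": 0}
    # Walk the sequence codon by codon; a trailing partial codon is ignored.
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        if codon[:2] == "TC" and codon[2] in "ACGT":
            counts["TCN"] += 1
        elif codon in ("AGT", "AGC"):
            counts["AGY"] += 1
    return counts


def dayhoff_recode(protein):
    """Map each amino acid to its Dayhoff class; gaps stay '-', anything else becomes '?'."""
    return "".join(
        AA_TO_CLASS.get(aa, "-" if aa == "-" else "?") for aa in protein.upper()
    )


if __name__ == "__main__":
    print(serine_family_counts("TCAAGCTCGAGTACC"))
    print(dayhoff_recode("STC-X"))
```

In this grouping serine and threonine fall in the same class, so a serine-to-threonine step is invisible after recoding, whereas cysteine keeps a class of its own and a serine-to-cysteine step remains visible.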

Files

ser_rev_suppl.info.pdf

Files (3.6 MB)

Checksum | Size
md5:4c4bf108c9198e7fc2fa687ab4c3720b | 1.0 MB
md5:749cc392a0e5fcbb4eba9fd5fb83791e | 1.7 MB
md5:ad7bb8427adc388d249916e4612758c0 | 757.5 kB

Additional details

Related works

Is cited by
10.1093/sysbio/sys077 (DOI)