New lineages provide insights into the convergent evolution of extreme salt adaptation within symbiotic Archaea
Description
Abstract
Environmental genomics has led to the discovery of many new lineages of archaea, including “DPANN” (or Nanobdellati), comprising organisms with small genomes, reduced gene content, and potentially symbiotic or parasitic lifestyles. DPANN live in various environments, and several lineages have been identified that are adapted to extremely high salt concentrations, including the Nanohaloarchaeota. Since it was long thought that the Haloarchaea (within ‘Euryarchaeota’) were the only salt-adapted archaea, the origins of these genome-reduced halophiles have been debated. Here we used phylogenetic, comparative genomic, and gene-tree/species-tree reconciliation approaches to resolve the evolution of halophily within DPANN, making use of recently-published genomes that help to inform the phylogenetic placement and genome evolution of salt-adapted lineages. Phylogenetic analysis placed Nanohaloarchaeota sister to a previously uncharacterised lineage, which we here refer to as Terrarchaeota. Terrarchaeota appear to be predominantly anaerobic thermophiles that are not adapted to high salt concentrations, indicating that salt adaptation evolved after their divergence from Nanohaloarchaeota. Furthermore, our analyses identified genomic hallmarks of salt adaptation in another recently discovered halophilic DPANN lineage within Aenigmatarchaeota, the Haloaenigmatarchaeaceae. We found that the Nanohaloarchaeota and Haloaenigmatarchaeaceae have distinct sets of proteins that enable life at high salt concentrations but share a common mechanism of evolutionary adaptation, in which niche-relevant genes were acquired horizontally from their halophilic hosts. This work provides the first detailed investigation into the enigmatic Terrarchaeota, and new insights into the convergent evolution of salt tolerance within symbiotic clades of Archaea.
Repository Contents
ALE includes all scripts and the complete workflow necessary to run the ALE analyses performed in this study as well as a folder containing the underlying treefiles and ALE output for the results described in the manuscript. Specifically this includes:
- The workflow used to perform ALE analyses
- Accessory scripts used as part of the ALE workflow
- Treefiles used for ALE reconciliation
- ALE output files for each individual gene family
Amino_Acid_Analysis includes the R workflow used to perform amino acid frequency analysis across the genome dataset used in the manuscript. This includes:
- The R script used for AA analyses
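As a rough illustration of what an amino acid frequency analysis involves, the sketch below computes per-residue relative frequencies for a set of protein sequences. It is written in Python rather than R and is not the script shipped in this folder. The function names and the toy sequences are hypothetical. The acidic-to-basic ratio is included only because acidic proteomes are a commonly reported signature of halophiles; whether the R workflow computes that particular statistic is an assumption.

```python
from collections import Counter

STANDARD_AA = "ACDEFGHIKLMNPQRSTVWY"


def aa_frequencies(sequences):
    """Return the relative frequency of each standard amino acid across all sequences."""
    counts = Counter()
    for seq in sequences:
        # Ignore gaps, stop symbols, and ambiguous residues such as X.
        counts.update(ch for ch in seq.upper() if ch in STANDARD_AA)
    total = sum(counts.values())
    if total == 0:
        return {aa: 0.0 for aa in STANDARD_AA}
    return {aa: counts[aa] / total for aa in STANDARD_AA}


def acidic_to_basic_ratio(freqs):
    """Ratio of acidic (D, E) to basic (K, R) residues; an illustrative summary only."""
    basic = freqs["K"] + freqs["R"]
    if basic == 0:
        return float("inf")
    return (freqs["D"] + freqs["E"]) / basic


# Toy proteome (hypothetical sequences, not taken from the dataset).
toy_proteome = ["MDEEDKR", "DDEEAAK"]
freqs = aa_frequencies(toy_proteome)
ratio = acidic_to_basic_ratio(freqs)
```

In practice the sequences would be read from each genome's predicted proteome, and the resulting frequency vectors compared across genomes.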
Annotation_Tables includes the scripts and input data necessary for generating the annotation tables presented in the manuscript. This includes:
- The script for generating the annotation count tables
- The script for generating gene presence/absence plots
- A set of files mapping annotation IDs to function used in the scripts
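The general idea behind an annotation count table and a presence/absence matrix can be sketched as follows. This is a minimal illustration under assumed inputs, not the scripts in this folder: the `(genome, annotation_id)` pair format, the function names, and the identifiers `ANN_1` and `ANN_2` are all hypothetical.

```python
from collections import defaultdict


def count_table(hits):
    """Count how many genes in each genome carry each annotation ID.

    hits: iterable of (genome, annotation_id) pairs, one per annotated gene.
    Returns a nested mapping: annotation_id -> genome -> count.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for genome, annotation_id in hits:
        counts[annotation_id][genome] += 1
    return counts


def presence_absence(counts, genomes):
    """Collapse counts to 1/0 per genome, in the order given by `genomes`."""
    return {
        annotation_id: [1 if per_genome.get(g, 0) > 0 else 0 for g in genomes]
        for annotation_id, per_genome in counts.items()
    }


# Toy input (hypothetical genome names and annotation IDs).
hits = [
    ("genomeA", "ANN_1"),
    ("genomeA", "ANN_1"),
    ("genomeB", "ANN_1"),
    ("genomeB", "ANN_2"),
]
counts = count_table(hits)
matrix = presence_absence(counts, ["genomeA", "genomeB"])
```

A mapping file from annotation ID to function, like those listed above, would then be joined onto the rows to label them for plotting.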
HGT_Analysis includes the scripts and mapping data for performing sisterhood frequency analysis as done in the manuscript. This includes:
- A Python script for counting the frequency of sisterhood across taxonomic ranks
- An R script for generating plots from the output of the Python script
- Two files listing the Family- and Order-level taxa in the dataset, for use alongside the Python script
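The counting step of a sisterhood frequency analysis can be sketched as below. This is an illustrative Python example, not the script in this folder. It assumes the sister clade of the query lineage has already been extracted from each gene tree as a list of leaf names, and that a leaf-to-taxon mapping at the chosen rank is available. The function name, the `mixed` and `unassigned` labels, and the toy taxa are hypothetical.

```python
from collections import Counter


def sisterhood_frequencies(sister_sets, leaf_to_taxon):
    """Tally which taxon forms the sister clade of a query lineage across trees.

    sister_sets: one list of leaf names per gene tree (the query's sister clade).
    leaf_to_taxon: mapping from leaf name to a taxon at one rank (e.g. Order).
    A sister clade spanning several taxa is counted as "mixed" (an assumption).
    """
    counts = Counter()
    for sisters in sister_sets:
        if not sisters:
            continue  # no sister clade recovered for this tree
        taxa = {leaf_to_taxon.get(leaf, "unassigned") for leaf in sisters}
        label = next(iter(taxa)) if len(taxa) == 1 else "mixed"
        counts[label] += 1
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()} if total else {}


# Toy example (hypothetical leaves and Order names).
leaf_to_order = {"h1": "OrderA", "h2": "OrderA", "m1": "OrderB"}
sister_sets = [["h1", "h2"], ["h1"], ["m1"], ["h1", "m1"]]
sister_freqs = sisterhood_frequencies(sister_sets, leaf_to_order)
```

Running the same tally with a Family-level mapping instead of an Order-level one gives the frequencies at the other rank.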
Selection_of_Phylogenetic_Markers includes the workflow and scripts for marker gene selection and species tree reconstruction as done in the manuscript. This includes:
- The workflow for marker gene selection (Selecting_best_markers_151set.sh)
- A Python script for identifying and removing long branches in trees (cut_gene_tree_v2.py)
- An R script for generating useful plots to compare marker gene suitability (TaxaCount_Stats_364taxa_revisions.R)
- A taxonomy mapping file for the dataset used in the study
- All marker gene alignments, single gene trees, concatenated phylogenies, and marker gene evaluation statistics produced by the R script listed above (TaxaCount_Stats_364taxa_revisions.R)
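One simple way to flag long branches in a gene tree is to compare each terminal branch length with the median across the tree. The sketch below shows that idea only. The criterion actually used by cut_gene_tree_v2.py is not described here, so the median-based cutoff, the `factor` parameter, and the function name are assumptions, and the branch lengths are toy values.

```python
from statistics import median


def long_branch_leaves(branch_lengths, factor=5.0):
    """Return leaves whose terminal branch exceeds `factor` times the median length.

    branch_lengths: mapping from leaf name to terminal branch length.
    The median-based cutoff is an illustrative assumption, not the published method.
    """
    if not branch_lengths:
        return []
    cutoff = factor * median(branch_lengths.values())
    return sorted(leaf for leaf, length in branch_lengths.items() if length > cutoff)


# Toy terminal branch lengths (hypothetical values).
toy_lengths = {"taxon_a": 0.10, "taxon_b": 0.12, "taxon_c": 0.09, "taxon_d": 1.50}
flagged = long_branch_leaves(toy_lengths)
```

Flagged leaves would then be pruned from the tree and the corresponding sequences removed from the alignment before the marker is re-evaluated.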
Files
ALE.zip
Total size: 14.9 GB

| MD5 | Size |
|---|---|
| 00055e1094466c4801bc2f11ace0df7c | 10.5 GB |
| dd86306e9eaf4aed4a9214364e40cb17 | 4.2 kB |
| fd7892ae306056b30b486c02b49485f7 | 156.1 MB |
| 3e95707560900676c7ecdc6307c36975 | 14.0 kB |
| 4caadffde3bb3d35da6d233ae91f1063 | 4.3 GB |