Published January 10, 2025 | Version v1
Dataset Open

New lineages provide insights into the convergent evolution of extreme salt adaptation within symbiotic Archaea

  • 1. Koninklijk Nederlands Instituut voor Onderzoek der Zee Afdeling Mariene Microbiologie en Biogeochemie
  • 2. University of Amsterdam
  • 3. Monash University
  • 4. University of Bristol
  • 5. Royal Netherlands Institute for Sea Research

Description

Abstract

Environmental genomics has led to the discovery of many new lineages of archaea, including “DPANN” (or Nanobdellati), comprising organisms with small genomes, reduced gene content, and potentially symbiotic or parasitic lifestyles. DPANN live in various environments, and several lineages have been identified that are adapted to extremely high salt concentrations, including the Nanohaloarchaeota. Since it was long thought that the Haloarchaea (within ‘Euryarchaeota’) were the only salt-adapted archaea, the origins of these genome-reduced halophiles have been debated. Here we used phylogenetic, comparative genomic, and gene-tree/species-tree reconciliation approaches to resolve the evolution of halophily within DPANN, making use of recently published genomes that help to inform the phylogenetic placement and genome evolution of salt-adapted lineages. Phylogenetic analysis placed Nanohaloarchaeota sister to a previously uncharacterised lineage, which we here refer to as Terrarchaeota. Terrarchaeota appear to be predominantly anaerobic thermophiles that are not adapted to high salt concentrations, indicating that salt adaptation evolved after their divergence from Nanohaloarchaeota. Furthermore, our analyses identified genomic hallmarks of salt adaptation in another recently discovered halophilic DPANN lineage within Aenigmatarchaeota, the Haloaenigmatarchaeaceae. We found that the Nanohaloarchaeota and Haloaenigmatarchaeaceae have distinct sets of proteins that enable life at high salt concentrations but share a common mechanism of evolutionary adaptation, in which niche-relevant genes were acquired horizontally from their halophilic hosts. This work provides the first detailed investigation into the enigmatic Terrarchaeota, and new insights into the convergent evolution of salt tolerance within symbiotic clades of Archaea.

Repository Contents

ALE includes all scripts and the complete workflow needed to run the ALE analyses performed in this study, as well as a folder containing the underlying treefiles and ALE output for the results described in the manuscript. Specifically, this includes:

  1. The workflow used to perform ALE analyses
  2. Accessory scripts used as part of the ALE workflow
  3. Treefiles used for ALE reconciliation
  4. ALE output files for each individual gene family

Amino_Acid_Analysis includes the R workflow used to perform amino acid frequency analysis across the genome dataset used in the manuscript. This includes:

  1. The R script used for AA analyses
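As an illustration of what an amino acid frequency analysis computes, the following minimal Python sketch tallies the relative frequency of each standard residue in a protein FASTA. It is not the R script from this repository; the function names `amino_acid_frequencies` and `acidic_fraction` and the input format are assumptions made for the example. The combined share of acidic residues (D and E) is included because halophile proteomes are often reported to be enriched in them.

```python
from collections import Counter

# The 20 standard amino acids; ambiguity codes (X, B, Z, *) are ignored.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_frequencies(fasta_text):
    """Return the relative frequency of each standard amino acid.

    `fasta_text` is the content of a protein FASTA file as a string
    (hypothetical input format for this sketch).
    """
    counts = Counter()
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and header lines
        counts.update(ch for ch in line.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return {aa: 0.0 for aa in AMINO_ACIDS}
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def acidic_fraction(freqs):
    """Combined frequency of aspartate (D) and glutamate (E)."""
    return freqs["D"] + freqs["E"]
```

Applied per genome, the resulting frequency dictionaries can be compared across lineages, for example by plotting `acidic_fraction` for each genome.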

Annotation_Tables includes the scripts and input data necessary for generating the annotation tables presented in the manuscript. This includes:

  1. The script for generating the annotation count tables
  2. The script for generating gene presence/absence plots
  3. A set of files, used by the scripts, that map annotation IDs to functions
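The two table types named above can be sketched in a few lines of Python. This is only an illustration under assumed inputs, not the repository's scripts: the example takes (genome, annotation ID) pairs, and the names `count_table` and `presence_absence` are hypothetical.

```python
from collections import defaultdict


def count_table(assignments):
    """Count how often each annotation ID occurs in each genome.

    `assignments` is an iterable of (genome, annotation_id) pairs,
    one pair per annotated protein (assumed input format).
    """
    table = defaultdict(lambda: defaultdict(int))
    for genome, annotation_id in assignments:
        table[genome][annotation_id] += 1
    return {genome: dict(counts) for genome, counts in table.items()}


def presence_absence(table):
    """Convert a count table into a 0/1 matrix over all annotation IDs."""
    all_ids = sorted({a for counts in table.values() for a in counts})
    return {
        genome: {a: int(counts.get(a, 0) > 0) for a in all_ids}
        for genome, counts in table.items()
    }
```

The presence/absence matrix has the same columns for every genome, which makes it straightforward to pass to a plotting routine.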

HGT_Analysis includes the scripts and mapping data for performing sisterhood frequency analysis as done in the manuscript. This includes:

  1. A Python script for counting the frequency of sisterhood across taxonomic ranks
  2. An R script for generating plots of the output from the Python script
  3. Two files listing the family- and order-level taxa in the dataset, for use alongside the Python script
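The idea behind a sisterhood frequency count can be sketched as follows. This is a simplified illustration and not the Python script from this repository: the tree is represented as nested tuples rather than parsed from a treefile, only single query leaves are considered (not whole query clades), and the names `leaves`, `sister_counts`, and `taxon_of` are assumptions made for the example.

```python
from collections import Counter


def leaves(node):
    """Return all leaf names below a node (a leaf is a plain string)."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node for leaf in leaves(child)]


def sister_counts(tree, taxon_of, query_taxon):
    """Count which taxa appear in the sister group of each query leaf.

    `tree` is a nested tuple, `taxon_of` maps leaf names to a taxon at
    the chosen rank, and `query_taxon` is the taxon of interest.
    """
    counts = Counter()

    def visit(node):
        if isinstance(node, str):
            return
        for i, child in enumerate(node):
            if isinstance(child, str) and taxon_of[child] == query_taxon:
                siblings = [s for j, s in enumerate(node) if j != i]
                # Each taxon is counted once per sister group.
                counts.update({taxon_of[l] for s in siblings for l in leaves(s)})
            visit(child)

    visit(tree)
    return counts
```

Summed over many gene trees, such counts show which taxa most often branch next to the query lineage, which is one way to flag candidate donors of horizontally acquired genes.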

Selection_of_Phylogenetic_Markers includes the workflow and scripts for marker gene selection and species tree reconstruction as done in the manuscript. This includes:

  1. The workflow for marker gene selection (Selecting_best_markers_151set.sh)
  2. A Python script for identifying and removing long branches in trees (cut_gene_tree_v2.py)
  3. An R script for generating useful plots to compare marker gene suitability (TaxaCount_Stats_364taxa_revisions.R)
  4. A taxonomy mapping file for the dataset used in the study
  5. All marker gene alignments, single gene trees, concatenated phylogenies, and marker gene evaluation statistics produced by the R script in item 3
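One common way to flag long branches is to compare each terminal branch length against a multiple of the median. The sketch below illustrates that heuristic only; the criterion actually used by cut_gene_tree_v2.py is not described here, and the function name `long_branch_leaves`, the input dictionary, and the default `factor` are assumptions.

```python
from statistics import median


def long_branch_leaves(terminal_lengths, factor=5.0):
    """Return leaves whose terminal branch exceeds factor * median length.

    `terminal_lengths` maps leaf names to terminal branch lengths
    (assumed to have been extracted from a gene tree beforehand).
    """
    cutoff = factor * median(terminal_lengths.values())
    return sorted(
        name for name, length in terminal_lengths.items() if length > cutoff
    )
```

Sequences flagged this way would typically be removed from the alignment before the gene tree is re-inferred.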

Files

ALE.zip

Files (14.9 GB)

Checksum, Size
  • md5:00055e1094466c4801bc2f11ace0df7c, 10.5 GB
  • md5:dd86306e9eaf4aed4a9214364e40cb17, 4.2 kB
  • md5:fd7892ae306056b30b486c02b49485f7, 156.1 MB
  • md5:3e95707560900676c7ecdc6307c36975, 14.0 kB
  • md5:4caadffde3bb3d35da6d233ae91f1063, 4.3 GB
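Because the largest archives are several gigabytes, it can be worth checking a download against the MD5 values listed above. The following sketch uses Python's standard `hashlib` and reads the file in chunks so that it does not need to fit in memory; the function names `md5_of_file` and `matches_checksum` are hypothetical, and the path passed in is whatever local file name the download was saved under.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading 1 MiB at a time."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum such as 'md5:0123...'."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_checksum("ALE.zip", "md5:...")` with the listed value for that archive returns `True` only if the local copy is byte-identical to the published file.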