Phylogenomic analyses reveal that Panguiarchaeum is a clade of genome-reduced Asgard archaea
Description
Abstract
The Asgard archaea are a diverse archaeal phylum that includes the host lineage from which eukaryotes evolved. Due to the importance of the Asgard archaea for our understanding of cellular evolution and eukaryotic complexity, recent efforts have focused on better characterizing the genomic diversity of this lineage and its relatives from the TACK archaea. Newly discovered lineages, the Njordarchaeales and Panguiarchaeales, are difficult to place, branching either with Korarchaeota or with Asgard archaea. The phylogenetic position of these taxa is important because they include genome-reduced and potentially symbiotic lineages, which may help to inform our understanding of genome evolution and the evolution of symbiotic interactions within Asgard archaea and their relatives. To resolve the placement of Njordarchaeales and Panguiarchaeales in the archaeal tree, we performed a range of phylogenetic analyses, which revealed that the Njordarchaeales and Panguiarchaeales constitute the newly identified class Njordarchaeia within the Asgard archaea. Members of the Njordarchaeia exhibit hallmarks of adaptation to (hyper)thermophilic lifestyles, which may contribute conflicting signals to phylogenetic analyses. Panguiarchaeum appears to be metabolically distinct from its relatives, displaying reduced metabolic potential and various auxotrophies. Gene tree-species tree reconciliation recovers a complex common ancestor of Asgard archaea that encoded the Wood-Ljungdahl pathway; however, the subsequent loss of this pathway during the reductive evolution of Panguiarchaeum may have been associated with the switch to a symbiotic lifestyle based on H2 syntrophy. Our analyses identify Panguiarchaeum as the first described Asgard archaeal lineage with streamlined genomes that show indications of a symbiotic lifestyle.
Repository Contents
1_Genome_files.tar.gz:
- Folder 'faa': this folder contains all protein sequence files for the 966 archaeal, 1325 bacterial, and 137 eukaryotic MAGs/genomes/largely complete transcriptomes used in this study.
- Folder 'annotations': this folder contains the protein annotation results for the protein sequences described above.
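As a quick sanity check after unpacking the archive, the number of protein sequences per genome can be tallied. The following is a minimal sketch, not part of the deposited scripts; the helper names and the `*.faa` file pattern are assumptions about how the files in 'faa' are named.

```python
from pathlib import Path
from typing import Iterable


def count_fasta_records(lines: Iterable[str]) -> int:
    """Count FASTA records by counting header lines that start with '>'."""
    return sum(1 for line in lines if line.startswith(">"))


def count_proteins_per_genome(faa_dir: str, pattern: str = "*.faa") -> dict[str, int]:
    """Map each protein FASTA file in faa_dir to its number of sequences.

    The '*.faa' pattern is an assumption about the file naming in the archive.
    """
    counts = {}
    for path in sorted(Path(faa_dir).glob(pattern)):
        with path.open() as handle:
            counts[path.name] = count_fasta_records(handle)
    return counts
```

Calling `count_proteins_per_genome("faa")` on the extracted folder would return one entry per genome file, which can then be compared against the expected number of genomes.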
2_phylogenies.tar.gz:
- 1_species_trees: this folder contains all alignments and tree files for species tree inference using different datasets, as well as the single gene tree files used for gene tree inspection and ranking. The files correspond to Figure 1 and Supplementary Figures 1, 5-7 and 10-15 of the manuscript and are organised as follows:
  - Folder '1_single_gene_tree_inspection' includes all treefiles and PDFs for single gene tree inspection.
  - Folder '2_ranking' includes all treefiles and split scores from marker ranking.
  - Folder '3_concatenation' includes unaligned sequences (under subfolder: .faa unaligned), untrimmed alignment files (under subfolder mafft), trimmed alignment files (under subfolder bmge), site likelihood files (.sitelh, under subfolder trees_sitelh) and treefiles (constrained and unconstrained, under trees_sitelh/treefiles) inferred under different models (see Methods) based on different datasets (subdirectories: 966 taxa, 303 taxa and 71 taxa).
- 2_single_gene_trees_ESP_and_other: this folder contains unaligned sequences, untrimmed alignments, trimmed alignment files and treefiles for single gene tree inference of ESP proteins and other single gene trees. Each single gene tree phylogeny has its own folder, and the files correspond to Supplementary Figures 2-4, 18-23 and 31-36 of the manuscript. For detailed commands, refer to 2_phylogenetic_analyses.qmd in 3_workflow_scripts.
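To illustrate what the concatenation step in '3_concatenation' produces conceptually, the sketch below joins several trimmed marker alignments into one supermatrix and pads taxa that lack a marker with gaps. It is a generic illustration rather than the authors' procedure (their commands are in the workflow documents); the function names are hypothetical, and it assumes that each FASTA header begins with the taxon name.

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Parse FASTA text into {taxon: sequence}, using the first header word as taxon."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def concatenate_alignments(alignments: list[dict[str, str]]) -> dict[str, str]:
    """Concatenate aligned markers; taxa missing from a marker get all-gap columns."""
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    supermatrix = {taxon: "" for taxon in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("all sequences in an alignment must have the same length")
        width = lengths.pop()
        for taxon in taxa:
            supermatrix[taxon] += aln.get(taxon, "-" * width)
    return supermatrix
```

Every row of the resulting supermatrix has the same length, equal to the sum of the individual alignment widths.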
3_workflow_scripts.tar.gz:
- 1_workflows: this folder includes the workflows (.qmd) used in this study.
  - 1_marker_inspection.qmd: this document includes code to generate files for marker gene inspection.
  - 2_phylogenetic_analyses.qmd: this document includes commands and examples for phylogenetic inference.
  - 3_annotations_workflow.qmd: this document includes the workflow for annotating protein sequences against different databases, analysis of gene presence and absence profiles, and amino acid composition.
  - 4_ALE_workflow.qmd: this document includes the workflow for reconciliation analyses.
- 2_scripts: this folder contains Python and bash scripts used in 1_workflows, and example scripts for phylogenetic tree visualisation.
4_ALE.tar.gz:
- reconcilliations-TableEvents_clean.7z: This file summarises the reconciliation results using default parameters.
- OR_recon-TableEvents_clean.7z: This file summarises the reconciliation results based on per-arCOG-category optimised origination rates (see Methods).
- SpeciesTreeRef.newick: species tree used in the reconciliation, with internal node IDs corresponding to those used in the files described above.
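To cross-reference the event tables with the species tree, the internal node IDs can be pulled out of the Newick string. The following is a minimal sketch under two assumptions: that the internal node IDs are stored as unquoted labels directly after each closing parenthesis, and that no dedicated tree library is required for this lookup. The function name is hypothetical.

```python
import re


def internal_node_labels(newick: str) -> list[str]:
    """Return the labels that directly follow a ')' in a Newick string.

    In Newick, a label placed right after a closing parenthesis belongs to
    the internal node that the parenthesis closes. Quoted labels are not handled.
    """
    return re.findall(r"\)([^():,;\s]+)", newick)
```

For a toy tree such as `((A:1,B:1)5:0.5,C:2)6;` the function returns the labels `5` and `6`. On the deposited file it could be applied to the text read from SpeciesTreeRef.newick.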
Files
(2.8 GB)
| MD5 checksum | Size |
|---|---|
| md5:fcc7392bf18c10f3da050ed871996282 | 2.5 GB |
| md5:adf24d7e77ffff345eeffe51afb18647 | 243.4 MB |
| md5:2fbe65335739abb8aefd0dd2a393b8e4 | 88.9 kB |
| md5:2b351a96c020efb08769cc654bfd3784 | 80.8 MB |
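The listed MD5 checksums can be used to confirm that a download is complete and uncorrupted. Below is a minimal sketch using Python's standard `hashlib` module; the function names are hypothetical, and which checksum belongs to which archive should be taken from the record's file listing.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected: str) -> bool:
    """Compare a file's MD5 digest with a listed value such as 'md5:<hex digest>'."""
    return md5_of_file(path) == expected.removeprefix("md5:").lower()
```

Reading in chunks matters here because the largest archive is about 2.5 GB. A call such as `matches_checksum(archive_path, "md5:<value from the table>")` returns `True` only if the file is byte-for-byte identical to the deposited one.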
Additional details
Dates
- Created: 2025-01-31