There is a newer version of the record available.

Published March 18, 2025 | Version v1
Publication Open

Beyond White-Nose Syndrome: A Multi-Scale Genomic Analysis of Pseudogymnoascus destructans

Description

Abstract

White-Nose Syndrome (WNS) has devastated insectivorous bat populations, particularly in North America, leading to severe ecological and economic consequences. Despite extensive research, many aspects of the evolutionary history, mitochondrial genome organization, and metabolic adaptations of its etiological agent, Pseudogymnoascus destructans, remain unexplored. Here, we present a multi-scale genomic analysis integrating pangenome reconstruction, phylogenetic inference, Bayesian divergence dating, comparative mitochondrial genomics, and refined functional annotation. We show that P. destructans exhibits extensive mitochondrial genome rearrangements absent in its nonpathogenic relatives from the class Leotiomycetes, suggesting a potential link between mitochondrial evolution and pathogenic adaptation. Our divergence dating analysis indicates that P. destructans separated from its Antarctic relatives approximately 141 million years ago, before adapting to bat hibernacula in the Northern Hemisphere. Additionally, our refined functional annotation significantly expands the known functional landscape of P. destructans, revealing an extensive repertoire of previously uncharacterized proteins involved in carbohydrate metabolism and secondary metabolite biosynthesis – key processes that likely contribute to its pathogenic success. By providing new insights into the genomic basis of P. destructans adaptation and pathogenicity, our study refines the evolutionary framework of this fungal pathogen and lays the foundation for future research on WNS mitigation strategies.

01_PanPhylo_analysis/

This directory contains all the files generated and analysed during the pangenome and phylogenomic analyses:

  1. pangenome — data from pangenome analysis:
    1. data — directory with the list of accession numbers of mitochondrial genomes to be analysed 
    2. Annotation — pre-annotated mitochondrial genomes from the RefSeq database:
      1. Genes — directory with .fasta files of nucleotide sequences
      2. Proteins_classic — directory with .fasta files of amino-acid sequences
      3. Proteins — .fasta files with renamed aa seqs
      4. LSINFO-.lst — list file for input in PanACoTA
      5. fLSTINFO-.lst — filtered list file for extracting the shell pangenome
    3. Pangenome — pangenome built by PanACoTA with a strict protein identity parameter (i = 0.9)
    4. Coregenome — extracted shell genome (proteins persistent in 2/3 of analysed genomes)
    5. Alignment — output of PanACoTA's align (MAFFT) module, used to extract the sequences of the shell genome
      1. ... — a lot of log files
      2. MSAs — MSAs renamed to show which gene family each one represents
      3. trimmed_MSAs — trimAl's trimmed MSAs
    6. model-finder — ModelFinder log files on all the trimmed MSAs
    7. tree — final phylogenies constructed using the best substitution model on all the trimmed MSAs
  2. phylogenomics — data from phylogenomics analysis:
    1. Proteins_renamed; Proteins_renamed_r2; Proteins_renamed_r3 — directories with .fasta files of amino-acid sequences after successive rounds of renaming to fit Proteinortho input requirements
    2. protein_ortho_output — directory with all the output files of Proteinortho
    3. All; All_names — directories with technical data used to extract SCOs
    4. all_pep.fa — .fasta file with all the mitochondrial proteomes combined, used to extract SCOs
    5. All_seqs; All_seqs_renamed — directories with .fasta files of SCOs
    6. MSAs — MSAs renamed to show which gene family each one represents
    7. trimmed_MSAs — trimAl's trimmed MSAs
    8. model-finder — ModelFinder log files on concatenated trimmed MSAs
    9. tree — final phylogenies constructed using the best substitution model on concatenated trimmed MSAs
  3. metadata — directory with the GenBank metadata on the analysed mitochondrial genomes, fetched with Phyloki:
    1. raw_metadata.tsv — Phyloki's first results
    2. metadata.tsv — data with filtered Year column
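
The shell-genome criterion described above (protein families persistent in 2/3 of the analysed genomes) can be illustrated with a minimal Python sketch. It assumes the simplest reading of "persistent", namely that a family has at least one member in a genome; the function name, family labels, and genome labels are hypothetical and are not taken from the archive or from PanACoTA.

```python
from fractions import Fraction

def shell_families(family_to_genomes, n_genomes, min_fraction=Fraction(2, 3)):
    """Return the IDs of families found in at least min_fraction of the genomes.

    family_to_genomes maps a family ID to the genomes holding a member of it.
    A Fraction is used so that the 2/3 threshold is compared exactly.
    """
    return sorted(
        family
        for family, genomes in family_to_genomes.items()
        # set() ignores paralogs: a genome counts once per family
        if len(set(genomes)) >= min_fraction * n_genomes
    )

# Toy input with hypothetical labels (assumption, not real data)
toy = {
    "fam1": ["gA", "gB", "gC"],
    "fam2": ["gA", "gB"],
    "fam3": ["gC"],
}
shell = shell_families(toy, n_genomes=3)
print(shell)
```

Stricter definitions, such as requiring exactly one member per genome, would need an additional check on the member counts.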

02_Comparative_genomics/

This directory contains all the files generated and analysed during the comparative genomic analysis:

  1. data — directory with analysed genomes both in .fasta and .gb formats
  2. ANI — all the Average Nucleotide Identity analysis data:
    1. querylist.txt; reflist.txt — FastANI's inputs
    2. fastani.out; fastani.out.matrix — FastANI's outputs
  3. ANI.csv — data from the ANI heatmap
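
As a rough illustration of how a pairwise table such as fastani.out could be reshaped for a heatmap, the sketch below assumes tab-separated rows whose first three fields are query, reference, and ANI value. The function name and the toy rows are hypothetical; the values are made up and do not come from the archive.

```python
import csv
import io

def ani_matrix(handle):
    """Parse tab-separated rows into a nested dict: matrix[query][reference] = ANI."""
    matrix = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 3:
            continue  # skip blank or truncated lines
        query, reference, ani = row[0], row[1], float(row[2])
        matrix.setdefault(query, {})[reference] = ani
    return matrix

# Toy rows with invented values (assumption, not real results)
example = io.StringIO(
    "a.fasta\tb.fasta\t98.5\t10\t12\n"
    "b.fasta\ta.fasta\t98.4\t10\t11\n"
)
matrix = ani_matrix(example)
print(matrix["a.fasta"]["b.fasta"])
```

Pairs that the tool did not report are simply absent from the nested dict, so a plotting step would need to decide how to display missing cells.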

03_Dating/

This directory contains all the files generated and analysed during the Bayesian evolutionary analysis:

  1. data — directory with all the data generated by analysis:
    1. dating_super_tree.xml — BEAUti's generated BEAST file
    2. dating_super_tree.trees; dating_super_tree.ops; dating_super_tree.log — BEAST outputs
    3. dating_super_tree.tree — TreeAnnotator's annotated tree
    4. dating_super_tree_ready.tree — tree ready for visualization
  2. screenshots — screenshots of the GUI application parameters set prior to running the analysis

04_Functional_annotation/

This directory contains all the files generated and analysed during the functional annotation analysis:

  1. data — directory with the initial files to be analysed:
    1. characterized.fasta — all the sequences available in the RefSeq database for the query
      '"Pseudogymnoascus destructans" AND Fungi NOT "uncharacterized" AND srcdb_refseq[PROP]'
    2. uncharacterized.fasta — all the sequences available in the RefSeq database for the query
      '"Pseudogymnoascus destructans" AND Fungi AND "uncharacterized" AND srcdb_refseq[PROP]'
    3. complete.fasta — merged .fasta file (characterized + uncharacterized)
  2. eggNOG — eggNOG-mapper annotations on all three profiles:
    1. characterized — annotations on characterized.fasta file
      1. characterized.emapper.annotations — main eggNOG-mapper's annotation file
      2. clean_characterized.emapper.annotations — eggNOG-mapper annotation file with duplicates removed
      3. characterized.emapper.seed_orthologs; characterized.emapper.genepred.fasta; characterized.emapper.genepred.gff; characterized.emapper.hits — other eggNOG-mapper annotation files
      4. characterized_cog_category_counts.tsv — count file with COG categories
      5. characterized_cog_category_counts_clean.tsv — processed count file with COG categories where multi-letter COG categories are treated like single-letter categories based on the 1st letter (e.g. KTN -> K)
    2. uncharacterized — annotations on uncharacterized.fasta file:
      1. Same as characterized
    3. complete — annotations on complete.fasta file:
      1. Same as characterized
  3. KEGGaNOG_data — data generated from running KEGGaNOG on characterized.emapper.annotations; uncharacterized.emapper.annotations & complete.emapper.annotations (this data was generated just for fun; it is not mentioned in the paper, and no description is provided)
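
The collapsing rule used for the *_cog_category_counts_clean.tsv files (multi-letter COG categories counted under their first letter, e.g. KTN -> K) can be sketched as follows. This is an illustrative reconstruction, not the script used to produce the files; the function name is hypothetical, and treating "-" as a missing value is an assumption about the input.

```python
from collections import Counter

def count_cog_categories(categories, collapse=True):
    """Count COG category strings, optionally collapsing to the first letter."""
    counts = Counter()
    for category in categories:
        category = category.strip()
        if not category or category == "-":
            continue  # assumed placeholder for "no category assigned"
        # With collapse=True, "KTN" is counted as "K"
        counts[category[0] if collapse else category] += 1
    return counts

# Toy values (assumption, not taken from the annotation files)
example = ["K", "KTN", "G", "-", "GM"]
collapsed = count_cog_categories(example)
raw = count_cog_categories(example, collapse=False)
print(dict(collapsed))
```

Calling the function with collapse=False gives the unprocessed counts, which corresponds to the distinction between the two count files described above.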

Files

01_PanPhylo_analysis.zip

Files (154.8 MB)

md5:dc948a8cea5b2d7d176567a6f9242d36 (3.8 MB)
md5:2e85a30007217a2ca6cc0df76a593d52 (167.1 kB)
md5:48ed84097fd935519149e254668c3da2 (6.2 MB)
md5:bf4794bf76082b7c0b9d494a1210e6f5 (144.6 MB)

Additional details

Dates

Available
2025-03-18
First published version