Published September 10, 2021 | Version v1
Dataset Open

Supporting data for: Gene-rich UV sex chromosomes harbor conserved regulators of sexual development (Carey et al., 2021)

  • 1. University of Florida
  • 2. Joint Genome Institute
  • 3. HudsonAlpha Institute for Biotechnology
  • 4. University of Paris-Saclay
  • 5. Duke University
  • 6. Philipp University of Marburg
  • 7. Clemson University
  • 8. RAPiD Genomics*
  • 9. Georgia Institute of Technology
  • 10. Center for International Forestry Research
  • 11. University of Turku
  • 12. Cornell University
  • 13. Chicago Botanic Garden*
  • 14. Texas Tech University

Description

Non-recombining sex chromosomes, like the mammalian Y, often lose genes and accumulate transposable elements, a process termed degeneration. The correlation between suppressed recombination and degeneration is clear in animal XY systems, but the absence of recombination is confounded with other asymmetries between the X and Y. In contrast, UV sex chromosomes, like those found in bryophytes, experience symmetrical population genetic conditions. Here we generate and use nearly gapless female and male chromosome-scale reference genomes of the moss Ceratodon purpureus to test for degeneration in the bryophyte UV sex chromosome system. We show the moss sex chromosomes evolved over 300 million years ago and expanded via two chromosomal fusions. Although the sex chromosomes show signs of weaker purifying selection than autosomes, we find suppressed recombination alone is insufficient to drive gene loss on sex-specific chromosomes. Instead, the U and V sex chromosomes harbor thousands of broadly-expressed genes, including numerous key regulators of sexual development across land plants.

Notes

  • novaseq_FASTQ_de_interlacer.pl -- splits paired-end Illumina NovaSeq data into forward and reverse files
  • liverwort_trinity_assemblies.tar.gz -- contains all de novo Trinity assemblies for liverworts used in this study
  • moss_trinity_assemblies.tar.gz -- contains all de novo Trinity assemblies for mosses used in this study
  • all_pep_files_for_orthofinder.tar.gz -- all peptide files for all species used in the OrthoFinder run in this study
  • Orthogroups.txt -- all orthogroups identified by OrthoFinder clustering
  • orthogroup_filter.pl -- perl script to filter orthogroups ("clusters") output by OrthoFinder for a minimum number of species
  • all_cds.fa.gz and all_pep.fa.gz -- fasta files containing all cds and peptides, respectively, for all species combined; used to write fasta files for each OrthoFinder gene cluster
  • fasta_from_OrthoFinder.pl -- perl script to write a separate fasta file for each Orthogroup ("cluster") output by OrthoFinder
  • alignment_length_filter.pl -- perl script to filter fasta files by a user input minimum number of nucleotides or amino acids
  • sexlinked_liverwort_alignments.tar.gz -- final, filtered cds alignments used to build gene trees of sex-linked genes in Marchantia polymorpha
  • sexlinked_moss_alignments.tar.gz -- final, filtered cds alignments used to build gene trees of sex-linked genes in Ceratodon purpureus
  • sexlinked_liverwort_trees.tar.gz -- RAxML gene trees with bootstrap support of sex-linked genes in Marchantia polymorpha
  • sexlinked_moss_trees.tar.gz -- RAxML gene trees with bootstrap support of sex-linked genes in Ceratodon purpureus
  • edlwtre2.pl -- perl script that roots gene trees and reduces isoforms of the same sample (within a clade) down to the longest isoform
  • physco_outgroup.py -- python script that uses ETE3 to identify C. purpureus sex-linked genes and the closest Physcomitrium patens outgroup
  • prune_tree.py -- python script that uses ETE3 to identify C. purpureus sex-linked genes and prune at the closest Physcomitrium patens outgroup. The script also randomly selects one isoform/homolog for each other species in the tree
  • array_hash_extractor_fasta_unlock_tree_mod.pl -- perl script that filters the original fasta file for those left after prune_tree.py
  • paml_header_prep.pl -- perl script for prepping the headers in gene trees and fasta files for PAML
  • paml_tree_prep.pl -- perl script for generating the differently labeled trees that PAML uses to test whether sex-linked genes evolve differently from autosomes
  • paml_bash.sh -- bash script to run PAML on multiple genes and report the results of dN, dS, and dN/dS for C. purpureus sex-linked genes
  • paml_AIC.pl -- perl script necessary to run PAML in paml_bash.sh
  • array_hash_extractor_fasta_unlock_ks.pl -- perl script that searches multiple alignments for a user-supplied list of C. purpureus one-to-one orthologous UV genes. The output is an individual alignment for each pair of U- and V-linked orthologous genes
  • aln_to_axt.pl -- perl script that converts an alignment of one-to-one UV genes into axt format for KaKs Calculator
  • ceratodon_genome_plots.R -- R script for generating gene tree plots, density plots, Ks on UV chromosome plot, codon metrics and dN/dS plots, and gene expression heatmaps
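
The orthogroup filtering step described for orthogroup_filter.pl can be sketched in a few lines. This is an illustrative Python sketch, not the authors' Perl script. It relies on the standard OrthoFinder Orthogroups.txt layout (one `OGxxxxxxx: gene gene ...` line per orthogroup), and it assumes that each gene ID carries its species label before a `|` separator. That naming scheme, the `species_of` helper, and the example IDs are hypothetical; the real peptide headers in this dataset may encode species differently.

```python
def parse_orthogroups(lines):
    """Parse 'OG0000001: geneA geneB ...' lines into {orthogroup: [gene, ...]}."""
    groups = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the first colon only; gene IDs follow, separated by whitespace.
        name, _, members = line.partition(":")
        groups[name.strip()] = members.split()
    return groups


def species_of(gene_id, sep="|"):
    """ASSUMPTION: the species label is the part of the gene ID before `sep`."""
    return gene_id.split(sep, 1)[0]


def filter_by_species(groups, min_species, sep="|"):
    """Keep orthogroups whose genes come from at least `min_species` species."""
    kept = {}
    for name, genes in groups.items():
        species = {species_of(gene, sep) for gene in genes}
        if len(species) >= min_species:
            kept[name] = genes
    return kept


# Hypothetical example lines; these IDs are not taken from the dataset.
example = [
    "OG0000000: Cpur|g1 Cpur|g2 Ppat|g7 Mpol|g3",
    "OG0000001: Cpur|g9 Cpur|g10",
]
groups = parse_orthogroups(example)
kept = filter_by_species(groups, min_species=3)
```

With the example above, only the first orthogroup spans three species, so it is the only one retained. To run this on the real file, pass an open file handle for Orthogroups.txt to `parse_orthogroups` and adapt `species_of` to the actual header format.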

Files

Files (3.6 GB)

  • md5:b3b1820705470af2903387aaeeee11bb -- 537.0 MB
  • md5:55e757a1600efb72a8968db4f6d4da91 -- 313.5 MB
  • md5:fe0e8c5f1e2a5a28ce749cfdc49dfbb5 -- 313.5 MB
  • md5:e42d61e22250dd1f7736a64557b69486 -- 656.4 MB
  • md5:f4d8e73cf64384c3c7e76010ab30bdb3 -- 1.6 GB
  • md5:2ed179c62a26ded0039221ce7901a5a8 -- 97.0 MB
  • md5:186bd9d32dc006c0f8e071f131f10031 -- 2.8 MB
  • md5:2fd8caeff6a305658722646eb4f3fe42 -- 422.6 kB
  • md5:e72512aa16a43c18458cd7f593bbc8c8 -- 27.6 MB
  • md5:3c3d9a6e9d1f8afff34821b7096e3fff -- 4.6 MB
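
The md5 values listed above can be used to check that a download completed intact. The following is a minimal Python sketch using the standard-library `hashlib` module. The function names and the file path in the usage note are placeholders, and the listing here does not say which checksum belongs to which file, so that pairing has to be taken from the record itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file's md5 with a listed value such as 'md5:b3b18...'."""
    # Accept the value with or without the 'md5:' prefix used in the listing.
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listing("path/to/downloaded_file", "md5:<value from the listing>")` returns `True` only when the local copy hashes to the listed value. Reading in 1 MiB chunks keeps memory use flat even for the 1.6 GB archive.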

Additional details

Related works

Is derived from
10.5281/zenodo.5385170 (DOI)