Published June 17, 2025 | Version v3
Dataset Open

Leveraging evolution to identify novel organismal models of human biology

  • 1. Arcadia Science

Contributors

  • 1. Arcadia Science

Description

This record contains the data associated with the publication "Leveraging evolution to identify novel organismal models of human biology". All data needed to replicate the analyses in the publication are provided, including the input data and run configurations needed to perform phylogenetic inference with NovelTree, calculate molecular conservation for all human genes, and carry out the exploratory analyses.

Directories and files included:

  • run_configurations/euk_preprocess_samplesheet.tsv - the samplesheet for our Snakemake preprocessing workflow, which filters and preprocesses species proteomes prior to analysis with NovelTree.
  • run_configurations/noveltree-model-euks-samplesheet.csv & run_configurations/noveltree-model-euks-parameterfile.json - the samplesheet and parameter file used to run NovelTree.
  • preprocessed_proteomes.tar.gz - a compressed tarball containing the preprocessed proteomes used by our NovelTree run.
  • results-noveltree-model-euks.tar.gz - a compressed tarball containing all outputs generated by our NovelTree run.
  • aa-summary-stats.tar.gz - a compressed tarball containing all AA summary statistics generated by code/genefam_aa_summaries.py.
  • gf-aa-multivar-distances.tar.gz - a compressed tarball containing all result files produced by code/calc_protein_mv_distances.R.
  • organismal_selection_tool_citations.csv - source citations describing available genetic perturbations for organisms in our portfolio.
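The compressed tarballs listed above can be inspected and unpacked with the Python standard library. The following is a minimal sketch, not part of the published workflow: the function names `list_tarball` and `unpack_tarballs` are hypothetical, and it assumes the four tarballs sit together in one directory after the top-level zip has been unzipped (the internal layout of the zip is not documented here, so missing files are skipped).

```python
import tarfile
from pathlib import Path

# Tarball names exactly as given in the record description.
TARBALLS = [
    "preprocessed_proteomes.tar.gz",
    "results-noveltree-model-euks.tar.gz",
    "aa-summary-stats.tar.gz",
    "gf-aa-multivar-distances.tar.gz",
]


def list_tarball(path):
    """Return the member names of a gzip-compressed tarball without extracting it."""
    with tarfile.open(path, mode="r:gz") as archive:
        return archive.getnames()


def unpack_tarballs(data_dir, out_dir):
    """Extract each listed tarball found in data_dir into out_dir/<tarball name>.

    Returns the names of the tarballs that were found and extracted.
    The directory layout is an assumption; tarballs that are absent are skipped.
    """
    data_dir, out_dir = Path(data_dir), Path(out_dir)
    # Use the safer "data" extraction filter where this Python version supports it.
    extract_kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    extracted = []
    for name in TARBALLS:
        tar_path = data_dir / name
        if not tar_path.is_file():
            continue
        target = out_dir / name[: -len(".tar.gz")]
        target.mkdir(parents=True, exist_ok=True)
        with tarfile.open(tar_path, mode="r:gz") as archive:
            archive.extractall(target, **extract_kwargs)
        extracted.append(name)
    return extracted
```

Calling `list_tarball` first is a cheap way to check what a tarball contains before committing disk space, since the full record is 13.9 GB compressed.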

Files

2024-organismal-selection-zenodo-v2.zip

Files (13.9 GB)

md5:a1cbf2512a72dc9d190093449236f2f7
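Because the archive is large, it is worth confirming that a download completed intact by comparing its MD5 digest with the checksum published in the file listing. This is a minimal sketch using Python's `hashlib`; the helper names `md5_of_file` and `matches_published_md5` are hypothetical, and the path to the downloaded zip is whatever location you saved it to.

```python
import hashlib

# Checksum as published in the file listing for this record.
EXPECTED_MD5 = "a1cbf2512a72dc9d190093449236f2f7"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks
    so that a multi-gigabyte archive never has to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_published_md5("2024-organismal-selection-zenodo-v2.zip")` should return `True` for an uncorrupted download saved under that name in the current directory.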