Published December 12, 2024 | Version v1
Dataset
Open
Identifying organismal models of human biology de novo
Description
This record contains the data associated with the publication "Identifying organismal models of human biology de novo". All data needed to replicate the analyses in the publication are provided, including the input data and run configurations needed to perform phylogenetic inference with NovelTree, to calculate molecular conservation for all human genes, and to carry out the exploratory analyses.
Directories and files included:
- run_configurations/noveltree-model-euks-samplesheet.csv: the samplesheet for our Snakemake preprocessing workflow to filter and preprocess species proteomes prior to analysis with NovelTree.
- run_configurations/euk_preprocess_samplesheet.tsv & run_configurations/noveltree-model-euks-parameterfile.json: the NovelTree sample and parameter files used to run NovelTree.
- preprocessed_proteomes.tar.gz: a compressed tarball containing the preprocessed proteomes used by our NovelTree run.
- results-noveltree-model-euks.tar.gz: a compressed tarball containing all outputs generated by our NovelTree run.
- aa-summary-stats.tar.gz: a compressed tarball containing all AA summary statistics generated by code/genefam_aa_summaries.py.
- gf-aa-multivar-distances.tar.gz: a compressed tarball containing all result files produced by code/calc_protein_mv_distances.R.
- organismal_selection_tool_citations.csv: source citations describing available genetic perturbations for organisms in our portfolio.
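Since several of the items listed above are gzip-compressed tarballs, a short Python sketch of inspecting and unpacking them may help. It assumes the four archives have already been obtained and placed in a local directory; the function names `top_level_entries` and `unpack_available` and the directory arguments are hypothetical and not part of the dataset or of NovelTree.

```python
import tarfile
from pathlib import Path, PurePosixPath

# Archive names as given in the file listing above.
ARCHIVES = [
    "preprocessed_proteomes.tar.gz",
    "results-noveltree-model-euks.tar.gz",
    "aa-summary-stats.tar.gz",
    "gf-aa-multivar-distances.tar.gz",
]


def top_level_entries(archive_path):
    """Return the sorted top-level names inside a gzip-compressed tarball."""
    names = set()
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar:
            parts = PurePosixPath(member.name).parts
            if parts:
                names.add(parts[0])
    return sorted(names)


def unpack_available(download_dir, dest_dir):
    """Extract each listed archive found in download_dir; return names extracted."""
    download_dir, dest_dir = Path(download_dir), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    for name in ARCHIVES:
        path = download_dir / name
        if not path.is_file():
            continue  # skip archives that are not present locally
        with tarfile.open(path, "r:gz") as tar:
            if hasattr(tarfile, "data_filter"):
                # Newer Python versions support extraction filters.
                tar.extractall(dest_dir, filter="data")
            else:
                tar.extractall(dest_dir)
        extracted.append(name)
    return extracted
```

Calling `top_level_entries` first gives a quick look at an archive's layout without writing anything to disk, which can be useful given the overall size of the record.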
Files
(11.4 GB)
| Name | Size |
|---|---|
| md5:58e53f0ff7ffd3b2a162bc8155c5d976 | 11.4 GB |
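The MD5 checksum in the Files table can be used to confirm that the 11.4 GB download arrived intact. The following is a minimal Python sketch of such a check; the function names `md5_of_file` and `matches_expected` are hypothetical, and the local path of the downloaded file is left to the caller because the file name is not shown above.

```python
import hashlib

# Checksum as listed in the Files table.
EXPECTED_MD5 = "58e53f0ff7ffd3b2a162bc8155c5d976"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Reading in chunks keeps memory use flat for multi-gigabyte files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

A mismatch usually indicates an incomplete or corrupted download, in which case fetching the file again is the simplest remedy.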