Data from: Scoutknife: A naïve, whole genome informed phylogenetic robusticity metric

Fleming, James; Eriksen, Pia Merete; Struck, Torsten Hugo

doi:10.5281/zenodo.8160834

Published July 18, 2023 | Version v1

Software Open

1. University of Oslo

The phylogenetic bootstrap, first proposed by Felsenstein in 1985, is a critically important statistical method in assessing the robusticity of phylogenetic datasets. Core to its concept was the use of pseudosampling: assessing the data by generating new replicates derived from the initial dataset that was used to generate the phylogeny. In this way, phylogenetic support metrics could overcome the lack of perfect, infinite data. With infinite data, however, it is possible to sample smaller replicates directly from the data to obtain both the phylogeny and its statistical robusticity in the same analysis. Due to the growth of whole genome sequencing, the depth and breadth of our datasets have greatly expanded and are set to expand further. With genome-scale datasets comprising thousands of genes, we can now obtain a proxy for infinite data. Accordingly, we can potentially abandon the notion of pseudosampling and instead randomly sample small subsets of genes from the thousands of genes in our analyses. Here, we introduce Scoutknife, a jackknife-style subsampling implementation that generates 100 datasets by randomly sampling a small number of genes from a large initial gene dataset to jointly establish both a phylogenetic hypothesis and assess its robusticity. Using 18 previously published datasets and 100 simulation studies, we show that Scoutknife is conservative and informative as to conflicts and incongruence across the whole genome, without the need for subsampling based on traditional model selection criteria.
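The subsampling idea described in the abstract can be illustrated with a minimal Python sketch. This is not the Scoutknife code itself: the function names `sample_gene_subsets` and `clade_support`, the subset size of 50 genes, and the representation of a tree as a set of clades are all assumptions made for illustration. Only the count of 100 replicates and the random sampling of a small number of genes come from the description above.

```python
import random


def sample_gene_subsets(gene_names, genes_per_replicate, n_replicates=100, seed=None):
    """Draw n_replicates gene subsets, each sampled without replacement.

    Hypothetical helper: it mirrors the jackknife-style subsampling described
    in the abstract, not the actual Scoutknife implementation.
    """
    genes = list(gene_names)
    if genes_per_replicate > len(genes):
        raise ValueError("cannot sample more genes than the dataset contains")
    rng = random.Random(seed)
    # Each replicate is an independent draw from the full gene set.
    return [sorted(rng.sample(genes, genes_per_replicate)) for _ in range(n_replicates)]


def clade_support(replicate_clades, clade):
    """Fraction of replicate trees that contain the given clade.

    Assumption: each replicate tree is summarised as a set of clades, and each
    clade is a frozenset of taxon names.
    """
    if not replicate_clades:
        raise ValueError("need at least one replicate tree")
    hits = sum(1 for clades in replicate_clades if clade in clades)
    return hits / len(replicate_clades)
```

In such a workflow, each subset returned by `sample_gene_subsets` would be concatenated and passed to a tree inference program of the user's choice, and `clade_support` would then summarise how often a clade of interest is recovered across the replicate trees.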

Notes

Data files can all be opened in any text editor.

Supplementary files in the Tables category are in Excel format and can be opened with LibreOffice.

Scoutknife is available at https://github.com/JFFleming/Scoutknife

Funding provided by: Norges Forskningsråd
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100005416
Award Number: 300587

Files

Name	Size	MD5
Scoutknife-main.zip	8.8 kB	08757c8466e029cc4a1d87d275c74d59
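
The MD5 checksum listed for the archive can be used to confirm that a download is intact. The following is a minimal Python sketch using the standard `hashlib` module. The function names and the local file path are assumptions for illustration; only the checksum value is taken from this record.

```python
import hashlib

# Checksum as listed in this record for Scoutknife-main.zip.
EXPECTED_MD5 = "08757c8466e029cc4a1d87d275c74d59"


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_MD5):
    """True if the file's MD5 digest equals the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_record("Scoutknife-main.zip")` should return `True` for an unmodified copy saved under that name in the working directory (the path is hypothetical).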

Additional details

Is source of: 10.5061/dryad.sxksn0383 (DOI)

