Published November 9, 2023 | Version v3
Dataset Open

Supplementary Information BeXY

  • 1. University of Fribourg

Description

These are all the input files that were used in the publication.

S1 Table: lists all samples analyzed in this paper, with their respective publications and ENA project numbers. It has four sheets: whole-genome sequencing (WGS) of ancient humans, the high-depth subset of the ancient WGS samples, 1240k target enrichment capture samples of ancient humans, and modern human WGS from the Simons Genome Diversity Project.

Capture1240k: contains the counts and the number of targets per scaffold for all ancient humans sequenced with 1240k target enrichment capture (see S1 Table).

downsampling_aneuploids: contains all downsampled counts for simulated aneuploid individuals as well as the scripts to generate them. Also contains our implementation of the method seGMM.

downsampling_trisomy: contains all downsampled counts for simulated individuals with trisomy 21 (factor21_1.5) and without trisomy 21 (factor_21_1) as well as the scripts to generate them.

downsampling_euploids: contains all downsampled counts for euploid individuals as well as the scripts to generate them.

lowQualityReference: contains all downsampled counts for the simulated low-quality reference genome assembly, as well as the scripts that were used to generate such an assembly based on the human reference genome.

WGS_ancient: contains the counts and the chromosome lengths for all ancient humans sequenced with whole-genome shotgun sequencing (see S1 Table).

WGS_ancient_samples_lt_2x: contains the counts and the chromosome lengths for the 116 high-depth WGS samples (see S1 Table).

WGS_modern_SGDP: contains the counts and the chromosome lengths for the 276 modern human samples downloaded from the Simons Genome Diversity Project (see S1 Table). wgs_modern_SGDP_original_counts_withoutSeqTypes.txt contains the counts obtained from the downloaded CRAM files, without considering differences in sequencing type. wgs_modern_SGDP_original_counts_withSeqTypes.txt contains the same counts, with the sequencing type of each sample specified in the second column. wgs_modern_SGDP_filteredMQ30_counts.txt contains the counts obtained after filtering on a mapping quality of 30.

non_model_organisms_posterior_probabilities_t: contains the posterior probabilities, as inferred by BeXY, for each scaffold to be autosomal, Y-linked, X-linked or different, for each of the six non-model organism species published in Nursyifa et al. 2021 (https://doi.org/10.1111/1755-0998.13491).

 

Files


Files (109.9 MB)

Checksum and size of each file:

md5:05ee862be04c5adfba30771690b26d5a  184.3 kB
md5:45b0fae4e7789a85185af5ac4183bc56  18.9 MB
md5:a8bbbe4ef562623c6777715dfd6811b7  10.9 MB
md5:d002f5defdf7e4c1853b7682cd9bc4f0  5.9 MB
md5:72717fd5283d11f3285dfa3d59d7150f  73.4 MB
md5:35cc6dbb3309f0248501e30aa8b3f2e9  36.1 kB
md5:c97c2d1abf0867c8b2393f6019efff9e  406.5 kB
md5:d4fcbc4dfdf2456a227901f423d454a9  89.2 kB
md5:adaf839ee69cd799663831aaf4c5dc4d  13.8 kB
md5:61b91555106787e43ed282388c869181  98.9 kB