Simulated nucleotide sequences for testing alignment-free genome distance estimates
Description
This repository contains 6,000 pairs of nucleotide sequences (12 evolutionary distances × 500 pairs each) that have been simulated for testing alignment-free genome distance estimates, as described in Criscuolo (2019). For each evolutionary distance d from 0.05 to 0.60 (step = 0.05), the program SeqGen was used to simulate the evolution of 500 nucleotide sequence pairs with d substitution events per character (GTR+Γ evolutionary model).
For each of the 12 evolutionary distances d = 0.05, 0.10, ..., 0.60, an XZ-compressed file containing 500 lines is available. Each line contains 18 fields separated by blank spaces:
[1] seed value used during simulation,
[2] true evolutionary distance d between the two simulated sequences,
[3] total number of simulated characters,
[4] number of non-indel characters with nucleotide mismatch,
[5] number of non-indel characters,
[6-9] A, C, G, T frequencies used during simulation,
[10-15] GTR parameters used during simulation,
[16] Γ distribution parameter used during simulation,
[17-18] two simulated sequences with indel events as gaps.
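
The field layout above can be read with a few lines of Python. The following is a minimal sketch, not part of the repository: the names `SimulatedPair`, `parse_line` and `read_pairs` are hypothetical, the seed is kept as text because its format is not specified here, and the `p_distance` property (field [4] divided by field [5]) is an added convenience for the observed proportion of mismatches.

```python
import lzma
from typing import Iterator, NamedTuple, Tuple


class SimulatedPair(NamedTuple):
    """One line of a distance file (hypothetical container, 18 fields)."""
    seed: str                       # field [1], kept as text
    true_distance: float            # field [2]
    n_characters: int               # field [3]
    n_mismatches: int               # field [4]
    n_non_indel: int                # field [5]
    base_freqs: Tuple[float, ...]   # fields [6-9]: A, C, G, T
    gtr_params: Tuple[float, ...]   # fields [10-15]
    gamma_shape: float              # field [16]
    seq1: str                       # field [17]
    seq2: str                       # field [18]

    @property
    def p_distance(self) -> float:
        """Observed proportion of mismatches among non-indel characters."""
        return self.n_mismatches / self.n_non_indel


def parse_line(line: str) -> SimulatedPair:
    """Split one space-separated line into its 18 fields."""
    f = line.split()
    if len(f) != 18:
        raise ValueError(f"expected 18 fields, found {len(f)}")
    return SimulatedPair(
        seed=f[0],
        true_distance=float(f[1]),
        n_characters=int(f[2]),
        n_mismatches=int(f[3]),
        n_non_indel=int(f[4]),
        base_freqs=tuple(float(x) for x in f[5:9]),
        gtr_params=tuple(float(x) for x in f[9:15]),
        gamma_shape=float(f[15]),
        seq1=f[16],
        seq2=f[17],
    )


def read_pairs(path: str) -> Iterator[SimulatedPair]:
    """Stream pairs from an XZ-compressed file without decompressing it to disk."""
    with lzma.open(path, "rt") as handle:
        for line in handle:
            if line.strip():
                yield parse_line(line)
```

Streaming line by line keeps memory use bounded, which matters because each line holds two full simulated sequences.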
Note that the gap-free version of each aligned sequence pair can be regenerated using SeqGen v1.3.4 with the parameters from fields [1], [3] and [6-16] and the following two-leaf model tree:
(t1:d,t2:0.000);
where d is given in field [2].
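
As an illustration of the regeneration step, the sketch below builds the two-leaf tree string and a candidate argument list from the fields of one line. The function names are hypothetical, and the mapping of fields to options is an assumption rather than the author's documented command: it supposes the executable is called `seq-gen`, that field [1] is the seed (`-z`), field [3] the sequence length (`-l`), fields [6-9] the base frequencies (`-f`), fields [10-15] the GTR rates in the order the program expects (`-r`), and field [16] the Γ shape (`-a`). Check these against the SeqGen manual before relying on them.

```python
from typing import List, Sequence


def two_leaf_tree(d: str) -> str:
    """Return the two-leaf model tree, with d taken verbatim from field [2]."""
    return f"(t1:{d},t2:0.000);"


def candidate_seqgen_args(fields: Sequence[str]) -> List[str]:
    """Build a candidate argument list from the 18 fields of one line.

    ASSUMPTION: the option letters and the field-to-option mapping below
    are a guess at a plausible invocation, not a confirmed command line.
    """
    if len(fields) != 18:
        raise ValueError(f"expected 18 fields, found {len(fields)}")
    seed, length = fields[0], fields[2]
    freqs, rates, alpha = fields[5:9], fields[9:15], fields[15]
    return [
        "seq-gen",                  # assumed executable name
        "-mGTR",                    # GTR substitution model
        f"-l{length}",              # field [3]: number of simulated characters
        f"-z{seed}",                # field [1]: seed value
        "-f" + ",".join(freqs),     # fields [6-9]: A, C, G, T frequencies
        "-r" + ",".join(rates),     # fields [10-15]: GTR parameters
        f"-a{alpha}",               # field [16]: gamma distribution parameter
    ]
```

The tree string would then be supplied to the program as its input tree, for example through a small text file or standard input.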
___
Criscuolo A (2019) A fast alignment-free bioinformatics procedure to infer accurate distance-based phylogenetic trees from genome assemblies. Research Ideas and Outcomes, 5:e36178. doi:10.3897/rio.5.e36178
Files
Total size: 13.4 GB
MD5 checksum | Size |
---|---|
0ddb2bfc2653791063e9d9d0b8cc3d30 | 843.7 MB |
3f755ef8d6150b360f6b4d1cccb94f26 | 907.9 MB |
a03606bff176c462a2b971db9415bcbf | 984.9 MB |
74dfaff1a72433bab0da79a95681074c | 1.0 GB |
12f2a148ae3cd14a0a07662b6d323634 | 1.1 GB |
053ee087d17d772de93c050b9e69990e | 1.1 GB |
b2634cee2bc4b6cdcf9f8045ec980d00 | 1.2 GB |
4d5725c17bcee62621d3dcbc689e6934 | 1.2 GB |
2cbef5bbf476710a17b92223d5084264 | 1.3 GB |
9872b16b7dfb49b6c276d8fcc9d24c21 | 1.3 GB |
ce2a236c41a138336aff771ed5ae7e50 | 1.2 GB |
70e9ff207c46f7418327db3eb9411a24 | 1.2 GB |
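
Because each file is around 1 GB, it can be worth checking a download against its listed MD5 checksum before use. The following is a minimal sketch using only the Python standard library; the function names and the file path passed to them are hypothetical.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected: str) -> bool:
    """Compare a file against a listed checksum, with or without an 'md5:' prefix."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

Reading in chunks avoids loading a whole compressed file into memory.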