Published February 20, 2024 | Version v1
Dataset
Open
10 Synthetic Genomics Datasets
Description
These are 10 synthetic genomics datasets generated with NEAT v3 (based on the TP53 gene of Homo sapiens) for benchmarking somatic variant callers. To learn more about our generation framework, please visit the synth4bench GitHub repository.
The datasets explore intrinsic NGS data parameters in order to observe their effect on tumor-only somatic variant calling algorithms. Of the 10 datasets, 5 vary in coverage (with all other parameters held fixed) and 5 vary in read length. The reads in all datasets are paired-end.
Name of File | Coverage | Length of Reads |
---|---|---|
300_30_10 | 300x | 150 |
700_70_10 | 700x | 150 |
1000_100_10 | 1000x | 150 |
3000_300_10 | 3000x | 150 |
5000_500_10 | 5000x | 150 |
1000_50 | 1000x | 50 |
1000_100 | 1000x | 100 |
1000_170 | 1000x | 170 |
1000_200 | 1000x | 200 |
1000_300 | 1000x | 300 |
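When scripting a benchmark across these datasets, the table above can be transcribed into a small lookup. The following is a minimal sketch only: `DATASETS` and `split_series` are illustrative names, not part of synth4bench, and the values are copied from the table rather than parsed from the file names, since the meaning of the additional name fields is not stated here.

```python
# Coverage (x) and read length for each dataset, copied from the table above.
# DATASETS is an illustrative name, not part of synth4bench.
DATASETS = {
    "300_30_10": (300, 150),
    "700_70_10": (700, 150),
    "1000_100_10": (1000, 150),
    "3000_300_10": (3000, 150),
    "5000_500_10": (5000, 150),
    "1000_50": (1000, 50),
    "1000_100": (1000, 100),
    "1000_170": (1000, 170),
    "1000_200": (1000, 200),
    "1000_300": (1000, 300),
}


def split_series(datasets, fixed_read_length=150):
    """Split dataset names into the coverage series and the read-length series.

    The coverage series holds read length at `fixed_read_length`;
    every other dataset belongs to the read-length series.
    """
    coverage_series = [
        name for name, (_, length) in datasets.items() if length == fixed_read_length
    ]
    read_length_series = [
        name for name, (_, length) in datasets.items() if length != fixed_read_length
    ]
    return coverage_series, read_length_series


coverage_series, read_length_series = split_series(DATASETS)
print("Varying coverage:", coverage_series)
print("Varying read length:", read_length_series)
```

Keeping the table as explicit data avoids guessing at the naming scheme and makes it easy to loop a variant caller over one series at a time.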
Files
(55.9 MB)
Name | Size | Download all |
---|---|---|
md5:c1d7f7638bfbf48c4e785b8605ad89ca | 25.7 kB | Download |
md5:ca9ee4b2f062238c1f1d73915b186cbb | 55.9 MB | Download |
Additional details
Software
- Programming language: Python
- Development Status: Active