Dataset Open Access

Simulated wastewater sequencing data for benchmarking SARS-CoV-2 variant abundance estimation

Baaijens, Jasmijn A.; Zulli, Alessandro; Ott, Isabel M.; Petrone, Mary E.; Alpert, Tara; Fauver, Joseph R.; Kalinich, Chaney C.; Vogels, Chantal B.F.; Breban, Mallery I.; Duvallet, Claire; McElroy, Kyle; Ghaeli, Newsha; Imakaev, Maxim; Mckenzie-Bennett, Malaika; Robison, Keith; Plocik, Alex; Schilling, Rebecca; Pierson, Martha; Littlefield, Rebecca; Spencer, Michelle; Simen, Birgitte B.; Yale SARS-CoV-2 Genomic Surveillance Initiative; Hanage, William P.; Grubaugh, Nathan D.; Peccia, Jordan; Baym, Michael

To evaluate the accuracy of variant abundance predictions from wastewater sequencing, we built a collection of benchmarking datasets that resemble real wastewater samples. For each variant (B.1.1.7, B.1.351, B.1.427, B.1.429, P.1) we created a series of 33 benchmarks by simulating sequencing reads from the variant genome together with a collection of background sequences (lineages that are not variants of concern or interest), such that the variant abundance ranges from 0.05% to 100%. Analogously, we created a second series of benchmarks by simulating reads only from the Spike gene of each SARS-CoV-2 genome. We refer to the first set of benchmarks as "whole genome" (WG) and to the second set as "S-only". We repeated these simulations at different sequencing depths: 100x and 1000x coverage for the whole genome benchmarks, and 100x, 1000x, and 10,000x coverage for the S-only benchmarks.
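To make the relationship between sequencing depth and variant abundance concrete, the following minimal sketch splits a total coverage into a variant share and a background share. It assumes that the stated depth (for example 100x) is the combined coverage of variant and background reads, which is one plausible reading of the description and not something the record confirms. The function name `split_coverage` is hypothetical and is not part of the dataset or its tooling.

```python
from typing import Tuple


def split_coverage(total_coverage: float, variant_percent: float) -> Tuple[float, float]:
    """Split a total coverage into (variant, background) coverage.

    Assumption: total_coverage is the combined depth of variant and
    background reads, and variant_percent is the variant abundance in
    percent (0.05 to 100 in the description above).
    """
    if not 0.0 <= variant_percent <= 100.0:
        raise ValueError("variant_percent must be between 0 and 100")
    variant = total_coverage * variant_percent / 100.0
    background = total_coverage - variant
    return variant, background


if __name__ == "__main__":
    # Endpoints of the abundance range, at the two whole-genome depths.
    for depth in (100, 1000):
        for percent in (0.05, 100.0):
            variant, background = split_coverage(depth, percent)
            print(f"{depth}x, {percent}% variant: "
                  f"variant {variant:g}x, background {background:g}x")
```

Under this assumption, the lowest abundance at the lowest depth corresponds to well under 1x of variant coverage, which illustrates why the benchmarks are repeated at higher depths.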

Files (7.8 GB)
Name, Size, MD5
S-only-10000x.tar.gz, 4.0 GB, md5:8ae9a37da2a6d7b6b07e04359b1a19ad
S-only-1000x.tar.gz, 391.2 MB, md5:d307294a92d25102a5cc70341ac89c16
S-only-100x.tar.gz, 38.8 MB, md5:58bdd104f63668a56831ddc6f76632a1
WG-1000x.tar.gz, 3.1 GB, md5:72e6236d04fba162c8b6f246efc9a52b
WG-100x.tar.gz, 307.5 MB, md5:56b7bb7e7a16e155cc17b1fb320a64c5
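
The MD5 checksums listed above can be used to confirm that an archive downloaded intact. The sketch below uses only the Python standard library and assumes the archives were saved under their listed names in the current working directory. The helper name `md5sum` and the dictionary `EXPECTED_MD5` are illustrative choices, not part of the dataset.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file list of this record.
EXPECTED_MD5 = {
    "S-only-10000x.tar.gz": "8ae9a37da2a6d7b6b07e04359b1a19ad",
    "S-only-1000x.tar.gz": "d307294a92d25102a5cc70341ac89c16",
    "S-only-100x.tar.gz": "58bdd104f63668a56831ddc6f76632a1",
    "WG-1000x.tar.gz": "72e6236d04fba162c8b6f246efc9a52b",
    "WG-100x.tar.gz": "56b7bb7e7a16e155cc17b1fb320a64c5",
}


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Assumption: archives sit in the current directory under their listed names.
    for name, expected in EXPECTED_MD5.items():
        archive = Path(name)
        if not archive.exists():
            print(f"{name}: not found, skipped")
            continue
        status = "OK" if md5sum(archive) == expected else "MISMATCH"
        print(f"{name}: {status}")
```

Reading in chunks matters here because the largest archive is 4.0 GB, so loading a whole file into memory at once would be wasteful.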