Dataset Open Access

Data from: Viral tagging reveals discrete populations in Synechococcus viral genome sequence space

Deng, Li; Ignacio-Espinoza, J. Cesar; Gregory, Ann C.; Poulos, Bonnie T.; Weitz, Joshua S.; Hugenholtz, Philip; Sullivan, Matthew B.

Microbes and their viruses drive myriad processes across ecosystems ranging from oceans and soils to bioreactors and humans. Despite this importance, microbial diversity is only now being mapped at scales relevant to nature, while the viral diversity associated with any particular host remains little researched. Here we quantify host-associated viral diversity using viral-tagged metagenomics, which links viruses to specific host cells for high-throughput screening and sequencing. In a single experiment, we screened 10^7 Pacific Ocean viruses against a single strain of Synechococcus and found that naturally occurring cyanophage genome sequence space is statistically clustered into discrete populations. These population-based, host-linked viral ecological data suggest that, for this single host and seawater sample alone, there are at least 26 double-stranded DNA viral populations with estimated relative abundances ranging from 0.06 to 18.2%. These populations include previously cultivated cyanophage and new viral types missed by decades of isolate-based studies. Nucleotide identities of homologous genes mostly varied by less than 1% within populations, even in hypervariable genome regions, and by 42–71% between populations, which provides benchmarks for viral metagenomics and genome-based viral species definitions. Together these findings showcase a new approach to viral ecology that quantitatively links objectively defined environmental viral populations, and their genomes, to their hosts.

Files (3.6 GB)
Name                       Size       md5
ANI_2_PCA.txt              351.3 kB   f85b8b8f9226a1b26df4dc9ddf39f0d3
Comm_MG.fna                53.8 MB    fa08df3d4c7b497acbc1c7d2fdc73125
ConsensusCGs.zip           1.4 MB     a769b4912cfc6016b59975eb95aa2c25
DATA-FIGURES_Replace.xls   196.1 kB   a9ac2730881f1f5beb5016d75c4e496c
GP23_Sequences.txt         13.5 kB    e4a18060ed8b5fdfc79df38c4b420b5a
RandomizationsX1500.FNA    136.7 MB   1fb11a47c02d2e37bd6de795ee47915f
RAREFACTION.zip            3.5 MB     bd63da8efa04c8a2056ad52ecf73a48b
VT_MG.fna                  40.1 MB    39e5a1546d62148115d9ef5466b7fe0c
VT_MG_IL.fastq             3.4 GB     f45c3654b26adf245581f7be37288eaa
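
Each file in the listing above is published with an md5 checksum, so a download can be checked for corruption before use. The following is a minimal sketch of such a check, not part of the dataset itself. The function names `md5_of_file` and `verify_downloads`, the `EXPECTED_MD5` table, and the assumption that files are saved locally under their listed names are all illustrative; only two of the smaller files are included, and the remaining entries can be added from the listing in the same way.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing (two small files shown as an example).
EXPECTED_MD5 = {
    "GP23_Sequences.txt": "e4a18060ed8b5fdfc79df38c4b420b5a",
    "ANI_2_PCA.txt": "f85b8b8f9226a1b26df4dc9ddf39f0d3",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory, expected=EXPECTED_MD5):
    """Map each file name to True (match), False (mismatch), or None (file missing)."""
    results = {}
    for name, checksum in expected.items():
        path = Path(directory) / name
        if path.is_file():
            results[name] = (md5_of_file(path) == checksum)
        else:
            results[name] = None
    return results
```

Calling `verify_downloads("path/to/downloads")` (a hypothetical directory) returns a dictionary keyed by file name. Reading in chunks matters mainly for the 3.4 GB VT_MG_IL.fastq file, which may not fit comfortably in memory if read in one call.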
