Viral contigs and RPKM abundance tables from Tempo and Timing: Synchronization between viral and host activity in hot spring phototrophic mats over a diel cycle
Description
This dataset contains all viral contigs and viral operational taxonomic units (vOTUs) identified from metagenomic and metatranscriptomic datasets analyzed in this study, along with their corresponding abundance estimates across samples.
Viral sequences were identified using the MVP pipeline (v1.0.3), which integrates geNomad (v1.7.6) for virus detection, followed by quality and completeness assessment with CheckV (v1.0.1). Conservative filtering criteria were applied based on virus score, genome length, and the presence of viral hallmark genes to ensure high-confidence viral identification. Both DNA and RNA viruses were included.
Viral contigs were subsequently binned into viral bins (vBins) using vRhyme (v1.1.0), and remaining sequences were retained as unbinned viral contigs (vContigs). All sequences were clustered into species-level viral operational taxonomic units (vOTUs) using average nucleotide identity (ANI ≥ 95%) and alignment fraction (AF ≥ 85%), with the longest sequence selected as the representative for each vOTU.
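The representative-selection step above can be illustrated with a minimal sketch. It assumes a greedy, centroid-based scheme in which sequences are visited from longest to shortest and each one joins the first existing representative that meets both thresholds. The record does not state which clustering algorithm or tool was used, so the function name `cluster_votus`, the input layout, and the contig names are all hypothetical.

```python
def cluster_votus(lengths, pairs, min_ani=95.0, min_af=85.0):
    """Greedy centroid clustering sketch (assumed scheme, not the authors' code).

    lengths: dict mapping sequence name -> length in bp
    pairs:   dict mapping (name_1, name_2) -> (ANI %, AF %) for compared pairs
    Returns a dict mapping each representative -> list of cluster members.
    """
    # Visit the longest sequences first so every representative is the
    # longest member of its cluster; ties are broken by name.
    order = sorted(lengths, key=lambda name: (-lengths[name], name))
    clusters = {}
    for seq in order:
        for rep in clusters:
            # Pairs may be stored in either orientation; missing pairs
            # are treated as having no detectable similarity.
            ani, af = pairs.get((rep, seq)) or pairs.get((seq, rep)) or (0.0, 0.0)
            if ani >= min_ani and af >= min_af:
                clusters[rep].append(seq)
                break
        else:
            # No representative matched, so this sequence starts a new vOTU.
            clusters[seq] = [seq]
    return clusters


# Hypothetical example data
lengths = {"contig_A": 40_000, "contig_B": 35_000, "contig_C": 12_000}
pairs = {
    ("contig_A", "contig_B"): (97.2, 90.0),  # passes both thresholds
    ("contig_A", "contig_C"): (96.0, 40.0),  # ANI passes, AF fails
}
votus = cluster_votus(lengths, pairs)
```

In this example `contig_B` joins the vOTU represented by `contig_A`, while `contig_C` forms its own vOTU because its alignment fraction falls below 85%.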
Abundance of vOTUs was estimated by mapping metagenomic reads to representative sequences using Bowtie2, followed by coverage estimation with CoverM (v0.7.0). Reads Per Kilobase per Million mapped reads (RPKM) values were calculated for each vOTU and sample, with values retained only when at least 10% of the contig length was covered by mapped reads.
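As a worked illustration of the abundance calculation, the sketch below computes RPKM for one vOTU in one sample and applies the 10% covered-length filter. The function names and example numbers are hypothetical, and two details are assumptions rather than statements from the record: the denominator uses the total number of mapped reads in the sample, and values that fail the filter are set to zero.

```python
def rpkm(mapped_reads, contig_length_bp, total_mapped_reads):
    """Reads per kilobase of contig per million mapped reads."""
    length_kb = contig_length_bp / 1_000
    mapped_millions = total_mapped_reads / 1_000_000
    return mapped_reads / (length_kb * mapped_millions)


def filtered_rpkm(mapped_reads, covered_bases, contig_length_bp,
                  total_mapped_reads, min_breadth=0.10):
    """Return RPKM only if enough of the contig is covered by reads.

    Assumption: values failing the breadth filter are reported as 0.0.
    """
    breadth = covered_bases / contig_length_bp  # fraction of contig covered
    if breadth < min_breadth:
        return 0.0
    return rpkm(mapped_reads, contig_length_bp, total_mapped_reads)


# Hypothetical example: 500 reads on a 10 kb contig, 2 million mapped reads
value_kept = filtered_rpkm(500, covered_bases=2_000,
                           contig_length_bp=10_000,
                           total_mapped_reads=2_000_000)
value_dropped = filtered_rpkm(500, covered_bases=500,
                              contig_length_bp=10_000,
                              total_mapped_reads=2_000_000)
```

Here the first call passes the filter (20% of the contig is covered) and yields 500 / (10 kb × 2 million reads) = 25 RPKM, while the second call covers only 5% of the contig and is zeroed.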
Files
(41.6 MB)

| Checksum | Size |
|---|---|
| md5:86a9246661cfd99a246a5770d0188da2 | 39.3 MB |
| md5:e7166ae8f26a162479e094ba8c07c025 | 2.3 MB |