Journal article Open Access

Unwarranted exclusion of intermediate lineage A/B SARS-CoV-2 genomes is inconsistent with the two spillover hypothesis of the origin of COVID-19

Steven E Massey; Adrian Jones; Daouyu Zhang; Yuri Deigin; Steven C Quay

Pekar et al. (2022) propose that SARS-CoV-2 arose through a zoonotic spillover that first infected humans at the Huanan Seafood Market in Wuhan, China. Their analysis rests on the hypothesis that two separate spillovers into humans gave rise to Lineage A and Lineage B, which differ by two SNVs, and that the one-SNV intermediate A/B genomes reported from numerous human infections are all sequencing errors, implying that genuine single-SNV intermediates occurred only in unsampled animal hosts. Consequently, confirmation of an intermediate A/B genome from a human infection would falsify their hypothesis. Pekar et al. identified 20 A/B intermediate genomes and excluded them from their analysis. A variety of exclusion criteria were applied, including low sequencing depth and the assertion of repeated sequencing errors at the lineage-defining positions 8782 and 28144. However, data from GISAID show that most of the genomes were sequenced to high coverage, contradicting these criteria. The decision to exclude the majority of the genomes was based on personal communications, and the raw data were not available for inspection. We observed multiple errors and inconsistencies in the exclusion process. Mapping analysis of a genome from Singapore, dismissed on the basis of an arbitrary read depth cutoff, confirms it as a true intermediate, while an intermediate genome from Wuhan was discarded even though it met the cutoff. Puzzlingly, two genomes from Beijing were discarded despite an average sequencing depth of 2175X. Lastly, we identify a new potential intermediate genome from Guangzhou. We therefore find that the exclusion of many of the intermediate genomes is unfounded, leaving the conclusion of two natural zoonoses unsupported.
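To make the notion of a one-SNV intermediate concrete, the following minimal sketch classifies a genome from its bases at positions 8782 and 28144. It is an illustration only, not the procedure used by Pekar et al. or in this article. The allele assignments (Lineage B carrying C8782/T28144, Lineage A carrying T8782/C28144) are assumed from commonly reported lineage definitions and are not stated in the text above; the function name `classify_lineage` is hypothetical.

```python
# Illustrative sketch: classify a genome by its bases at the two
# lineage-defining positions named in the abstract (8782 and 28144).
# ASSUMPTION: allele assignments follow commonly reported definitions
# (Lineage B: C8782/T28144, Lineage A: T8782/C28144); they are not
# given in the abstract itself.

LINEAGE_A = ("T", "C")  # (base at 8782, base at 28144)
LINEAGE_B = ("C", "T")
# One-SNV intermediates share one allele with each lineage.
INTERMEDIATES = {("C", "C"), ("T", "T")}


def classify_lineage(base_8782: str, base_28144: str) -> str:
    """Return 'A', 'B', 'intermediate A/B', or 'undetermined'."""
    alleles = (base_8782.upper(), base_28144.upper())
    if alleles == LINEAGE_A:
        return "A"
    if alleles == LINEAGE_B:
        return "B"
    if alleles in INTERMEDIATES:
        return "intermediate A/B"
    # Ambiguous calls (e.g. 'N') or unexpected bases are left unclassified.
    return "undetermined"
```

A base call alone does not settle whether an intermediate is genuine; as argued above, that depends on the read depth and raw read support at both positions.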

