Published June 21, 2022 | Version v2
On some unpublished early SARS-CoV-2 sequences


The origin of SARS-CoV-2 is still unknown: the chain of events that brought a virus whose close relatives are found in Rhinolophus bats in Yunnan province and in Laos to the Huanan seafood market in Wuhan in early December 2019 remains to be elucidated. In particular, the non-market patients and the genetically more ancestral Lineage A remain unexplained.

A retrospective analysis identified 174 patients with onset in December 2019; of these, only 15 have been sequenced and published, often multiple times. By collating as much data as possible on early cases, we found some data on 65 patients with onset in December 2019. Furthermore, we detected two patients who had been sequenced, but whose sequences were never uploaded to a public database and whose raw reads, although published, were not reanalyzed.

We also present some information on the first Beijing patient, who had an onset date of December 17, 2019 and was related to the Huanan market outbreak.

Using the collated information, we make significant progress towards resolving the discrepancies in the early sequences. We present a phylogeny of 19 early patients, based on onset dates, as well as several estimates of the time to the most recent common ancestor (tMRCA), all falling in late November.



