Characterization of an infectious clone of a novel Coronavirus closely related to Bat coronavirus HKU4 from PRJNA602160
Creators
Description
This is the full supplementary information, sequences and 3D models associated with the publication "Characterization of an infectious clone of a novel Coronavirus closely related to Bat coronavirus HKU4 from PRJNA602160".
All contig sequences with homology to KJ473822.1 found within PRJNA602160 have been deposited as 10915167.fa, 10915172.fa, 10915173.fa and 10915174.fa.
Reads supporting the vector-virus junctions within SRR10915173 are included in 10915173.fa.
The raw MEGAHIT contig sequences from PRJNA602160 have been deposited as SRR10915167_final.contigs.fa.gz, SRR10915172_final.contigs.fa.gz, SRR10915173_final.contigs.fa.gz and SRR10915174_final.contigs.fa.gz.
Sequences found to be near-identical to MERS-CoV have been deposited as MERS_CoV from SRR10915173.fa and MERS_CoV from SRR10915174.fa.
The Addgene analysis results supporting Fig. 1 have been deposited as 10915173_addgene_analysis_result.gb.
The annotated genome of the HKU4-related coronavirus clone found in SRR10915173 has been deposited as 10915173_annotation.gb.
Additional analysis of SRR10915173 was performed using coronaSPAdes, and the results have been deposited as SRR10915173_coronaspades_default.tar.gz.
The three-dimensional model of the RBD of the HKU4-related coronavirus clone has been deposited as HKU4_RBD.pdb.
Sequences with homology to the bat Tylonycteris pachypus have been deposited as Bat_candidate.fa, together with their similarity (identity/length of match) to the bat sequence, the most similar sequence in the NCBI nt database, and their similarity (identity/length of match) to that sequence.
Methods
Sequencing data and assembly
Using the NCBI STAT phylogenetic analysis tool from the SRA run browser, we identified four sequencing datasets that were positive for coronaviruses (SRR10915167, SRR10915172, SRR10915173 and SRR10915174) from the Huazhong Agricultural University Oryza sativa BioProject PRJNA602160.
These four datasets were downloaded and assembled using MEGAHIT[4]. The resulting contigs were searched against BtTp-BetaCoV/GX2012 (KJ473822.1), the sequence identified by the NCBI STAT phylogenetic analysis tool. This revealed, in dataset SRR10915173, a complete sequence 32725 nt in length that is 98.38% similar to KJ473822.1, the closest related sequence on NCBI.
We also searched this dataset for sequences from Tylonycteris pachypus, the natural bat host of HKU4-related coronaviruses; however, no sequences that could be identified as coming from this species were found.
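The length-based screening of assembled contigs can be sketched in Python as follows. This is only an illustrative sketch, not the authors' pipeline: the function names (`read_fasta`, `open_fasta`, `contigs_longer_than`) are hypothetical, the threshold is the HKU4 genome size quoted in the Methods, and the homology search itself (performed with BLAST in the study) is not reproduced here.

```python
import gzip
from typing import Dict, IO, Iterator, Tuple

# HKU4 genome size in nt, as quoted in the Methods text.
HKU4_GENOME_LENGTH = 30247


def read_fasta(handle: IO[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an open FASTA text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def open_fasta(path: str) -> IO[str]:
    """Open a plain or gzip-compressed FASTA file in text mode."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def contigs_longer_than(
    handle: IO[str], min_length: int = HKU4_GENOME_LENGTH
) -> Dict[str, int]:
    """Map contig header to length for contigs strictly longer than min_length."""
    return {h: len(s) for h, s in read_fasta(handle) if len(s) > min_length}
```

Assuming one of the deposited contig files has been downloaded locally, `contigs_longer_than(open_fasta("SRR10915173_final.contigs.fa.gz"))` would list the contigs that exceed the HKU4 genome length and are therefore worth inspecting for non-viral flanking sequence.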
Identification of the sequence as an infectious clone
As the contig was longer than the genome size of merbecoviruses (30247 nt for HKU4), we performed a BLAST analysis of the sequences flanking the HKU4 genome on this contig. This revealed homology to many expression and cloning vector sequences that were directly fused to the 5’-end and 3’-end of the coronavirus genome. A BLAST search of the 5’-end and 3’-end of the coronavirus genome was then performed, which verified the presence of reads covering the vector-virus junctions at both the 5’-end and the 3’-end of the genome.
Sequence analysis was then performed using the Addgene sequence analyzer[5], which revealed a CMV promoter upstream of the 5’-end of the coronavirus genome and a bGH polyA signal downstream of the 3’-end of the coronavirus genome, confirming that the sequence originates from an infectious clone.
The complete genome of the HKU4-related coronavirus was manually annotated to indicate all open reading frames (ORFs) and deposited as 10915173_annotation.gb.
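To make the flanking-sequence step concrete, the sketch below splits a contig into the regions upstream and downstream of an aligned genome and extracts a window across a junction. It is an assumption-laden illustration: the function names and the coordinate convention (1-based, inclusive, as BLAST reports them) are chosen here for the example, and the actual alignment coordinates of the viral genome on the contig are not given in this record. The only numbers taken from the text are the contig length (32725 nt) and the HKU4 genome size (30247 nt).

```python
from typing import NamedTuple

CONTIG_LENGTH = 32725       # nt, contig length reported in the Methods
HKU4_GENOME_LENGTH = 30247  # nt, HKU4 genome size reported in the Methods

# Rough size of the non-viral sequence to expect, IF the cloned genome were
# the same length as HKU4 (an assumption; the clone's exact length may differ).
EXPECTED_EXCESS = CONTIG_LENGTH - HKU4_GENOME_LENGTH  # 2478 nt


class Flanks(NamedTuple):
    upstream: str    # sequence before the first aligned genome base
    downstream: str  # sequence after the last aligned genome base


def split_flanks(contig: str, genome_start: int, genome_end: int) -> Flanks:
    """Return the contig sequence flanking an aligned genome.

    genome_start and genome_end are 1-based, inclusive contig coordinates.
    """
    if not 1 <= genome_start <= genome_end <= len(contig):
        raise ValueError("genome coordinates fall outside the contig")
    return Flanks(contig[: genome_start - 1], contig[genome_end:])


def junction_window(contig: str, junction: int, half_width: int = 30) -> str:
    """Return up to half_width bases on each side of a junction.

    junction is the 1-based position of the last base before the junction.
    """
    start = max(0, junction - half_width)
    end = min(len(contig), junction + half_width)
    return contig[start:end]
```

The two flanks returned by `split_flanks` are what one would submit to a vector-aware BLAST search, while the output of `junction_window` could serve as a query to look for reads that span a vector-virus junction.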
Files
(74.1 MB)
Checksum | Size |
---|---|
md5:ad9cadc2df8b6c997b60a48f6c2df6cf | 1.5 kB |
md5:c06daa6bc12bb2122913cd9cde585263 | 11.8 kB |
md5:fe5cec37e3457be2cc8772586ac39e38 | 733 Bytes |
md5:7701fa71e75136caabe62253154fe690 | 46.0 kB |
md5:53fb32574c04e55fe2690a8dcf8bb3a4 | 57.5 kB |
md5:823360b670f7449ad8657be079e1451f | 62.9 kB |
md5:dbf150ee6c47cb5ead5ce1cbec3457ac | 30.6 kB |
md5:f8767ae8e38b4ab4c0b90a5539b05bbc | 26.6 kB |
md5:9329ccfa035840031d5349588fe258ad | 134.8 kB |
md5:86cec7c99fdfab1032f9a19bf0b26633 | 450 Bytes |
md5:951d01034334697b60526801543a3164 | 1.4 kB |
md5:8ff8f5bc1d6cea8a689dba4a6f70ae96 | 2.1 kB |
md5:f9ada0ee28e84755390813c0aa9fed16 | 4.5 MB |
md5:5a164f1bcfb38b4b13bb581d5128fc51 | 7.5 MB |
md5:d39caf7df50db7ca9a34cb0207c2422e | 8.5 MB |
md5:14fd333524326cca88de4b641b3d2098 | 38.0 MB |
md5:3ee9e022f90858c1052b4074ea24c7a3 | 7.1 MB |
md5:f310c4265c4086255790a8a25b263bf0 | 8.1 MB |
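The MD5 checksums listed for the files can be used to check a download for corruption. The following is a minimal sketch using only the Python standard library; the helper names are hypothetical, and because the file names are not shown next to the checksums in this listing, pairing a local file with its checksum is left to the reader.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path: str, listed: str) -> bool:
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_listed_md5(local_path, "md5:14fd333524326cca88de4b641b3d2098")` would return `True` only if `local_path` (a placeholder for wherever the 38.0 MB file was saved) is byte-for-byte identical to the deposited file.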