Published November 20, 2023 | Version v1
Dataset Open

The genomic and epidemiological virulence patterns of Salmonella enterica serovars in the United States

Description

The serovars of Salmonella enterica display dramatic differences in pathogenesis and host preferences. We developed a process (patent pending) for grouping Salmonella isolates and serovars by their public health risk. We collated a curated set of 12,337 S. enterica isolate genomes from human, beef, and bovine sources in the US. After annotating a virulence gene catalog for each isolate, we used unsupervised random forest methods to estimate the proximity (similarity) between isolates based upon the genomic presentation of putative virulence traits. We then grouped isolates into virulence clusters using hierarchical clustering (Ward's method), used non-parametric bootstrapping to assess cluster stability, and externally validated the clusters against epidemiological virulence measures from FoodNet, the National Outbreak Reporting System (NORS), and US federal sampling of beef products. We identified five stable virulence clusters of S. enterica serovars. Cluster 1 (higher virulence) serovars yielded an annual incidence rate of domestically acquired sporadic cases roughly one and a half times higher than the other four clusters combined (Clusters 2-5, lower virulence). Compared to the other clusters, Cluster 1 also had a higher proportion of infections leading to hospitalization and was implicated in more foodborne and beef-associated outbreaks, despite being isolated from beef products at a frequency similar to that of the other clusters. We also identified subpopulations within 11 serovars. Remarkably, we found S. Infantis and S. Typhimurium subpopulations that differed significantly in genome length and clinical case presentation. Further, we found that the presence of the pESI plasmid accounted for the genome length differences between the S. Infantis subpopulations. Our results show that S. enterica strains associated with the highest incidence of human infections share a common virulence repertoire.
This work could be updated regularly and used in combination with foodborne surveillance information to prioritize serovars of public health concern.  
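The proximity and clustering steps described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the description does not say which software, forest settings, or proximity definition were used, so the Breiman-style construction below (real data versus column-permuted data, proximity as the share of trees in which two isolates land in the same leaf), the use of scikit-learn and SciPy, the function names, and the toy presence/absence matrix are all assumptions made for illustration. The bootstrap stability assessment is not shown.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.ensemble import RandomForestClassifier


def rf_proximity(X, n_trees=200, seed=0):
    """Unsupervised random forest proximity (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Synthetic contrast class: permute each column independently, which keeps
    # per-gene frequencies but breaks co-occurrence between genes.
    X_synth = np.column_stack([rng.permutation(col) for col in X.T])
    X_all = np.vstack([X, X_synth])
    y_all = np.r_[np.ones(len(X), dtype=int), np.zeros(len(X_synth), dtype=int)]

    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(X_all, y_all)

    # leaves[i, t] is the leaf reached by real isolate i in tree t.
    leaves = forest.apply(X)
    # Proximity = fraction of trees in which two isolates share a leaf.
    # This broadcast is fine for a toy matrix; thousands of isolates would
    # need a chunked computation instead.
    same_leaf = leaves[:, None, :] == leaves[None, :, :]
    return same_leaf.mean(axis=2)


def ward_clusters(proximity, k):
    """Cut a Ward dendrogram built on 1 - proximity into k clusters."""
    dist = 1.0 - proximity
    np.fill_diagonal(dist, 0.0)
    # Ward linkage formally assumes Euclidean distances; applying it to
    # 1 - proximity is a simplification made for this sketch.
    Z = linkage(squareform(dist, checks=False), method="ward")
    return fcluster(Z, t=k, criterion="maxclust")


# Toy data (invented): 60 isolates x 20 virulence genes, two gene blocks.
rng = np.random.default_rng(1)
freq_a = np.r_[np.full(10, 0.9), np.full(10, 0.1)]
freq_b = np.r_[np.full(10, 0.1), np.full(10, 0.9)]
X = np.vstack([
    (rng.random((30, 20)) < freq_a).astype(int),
    (rng.random((30, 20)) < freq_b).astype(int),
])

prox = rf_proximity(X)
labels = ward_clusters(prox, k=2)
print(np.bincount(labels)[1:])  # cluster sizes
```

In this sketch the number of clusters `k` is fixed by hand; the description indicates that the study instead judged cluster stability by non-parametric bootstrapping.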

Files contained in this repository will reproduce elements of Figures 3, 4, 6, and 7 of the accompanying PLOS ONE manuscript.

 

Files

box_inf.csv

Files (6.6 MB)

Checksum Size
md5:3aab7fa3128f01c3eb7a0ea589c62fb2 9.5 kB
md5:84769dc8aad1d47234331c0b399f22a1 27.9 kB
md5:c3d0777a792165760dcceb7f73b4beb7 27 Bytes
md5:958d431007c02775106b57b65f9773ce 27 Bytes
md5:7ee6227328674763e29f7099c61c4e12 27 Bytes
md5:666ab9037017b57ce5e523e7bf10b9da 27 Bytes
md5:43dccb6c1963c65764b0b848666192ff 163.6 kB
md5:43dccb6c1963c65764b0b848666192ff 163.6 kB
md5:172ae7b92fa0feef8d3855d8e67c0ae3 604.5 kB
md5:0fd12af1d4e1ecff5bc15a7a13edb57d 4.6 kB
md5:62b9f36cff258f8dd5af99fc50d44876 2.1 kB
md5:b4a1d8930a1bdb61b71e5e22e38ee44c 5.9 kB
md5:ff1e8c4e41f8b4829559fb1a53cbc5bc 20.2 kB
md5:3085165a0d17f3cda842e90d98dc6bac 9.9 kB
md5:91d1389f67de7378002740205573fab2 21.4 kB
md5:689adf2c03cfb65f4834c5c1b7a2ff65 107 Bytes
md5:3ec9f9f611a280b7e7bd6dabdef4893c 272 Bytes
md5:fe0e34db8eaca42c530266781d5073cc 99.9 kB
md5:fbc3c3a61c8e1a330d6dcd96bd0373bb 451.3 kB
md5:a35ebacc55ff16ea02b153ad6d00cebb 8.4 kB
md5:a7ce0005946a668cba3e06d1c1557ad3 5.1 kB
md5:c10cb199560e17a6c291e26982ae12f4 741 Bytes
md5:98e072d0b420ea13914a6f96652c9a4a 1.8 kB
md5:d8e8cae2d3c00b04543ed10631fbfad8 2.5 MB
md5:acced8400f7da00850d56ecb8a5620e2 2.5 MB
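
The MD5 checksums listed above can be used to confirm that a downloaded file arrived intact. The following is a minimal sketch using only Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_listing` are hypothetical, and because the listing does not pair checksums with file names, the demonstration runs on a temporary file rather than on a file from this record.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a listing entry such as 'md5:3aab...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Demonstration on a throwaway file with known contents.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello world")
    tmp_path = tmp.name

demo_ok = matches_listing(tmp_path, "md5:5eb63bbbe01eeed093cb22bb8f5acdc3")
os.remove(tmp_path)
print(demo_ok)
```

To check a real download, pass its local path together with the corresponding `md5:` entry from the listing to `matches_listing`.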

Additional details

Dates

Accepted
2023-11
Data and code for Figures 3, 4, 6, and 7