Published April 14, 2021 | Version v1
Dataset Open

Curated set of 1703 Streptococcus suis genomes

  • 1. Amsterdam UMC, University of Amsterdam, Departments of Medical Microbiology and Global Health, Amsterdam NL
  • 2. Amsterdam UMC, University of Amsterdam, Department of Medical Microbiology, Amsterdam NL



This dataset consists of 1703 genome assemblies of Streptococcus suis, curated by the authors. Data from NCBI SRA and NCBI Assembly were combined. If an assembly was available through NCBI Assembly, it was used. Otherwise, a de novo genome assembly was created from NCBI SRA data using SKESA v2.1.0 with default parameters. Assembly quality was evaluated using Quast v5.0.2. Any assembly with more than 500 contigs, an N50 lower than 10 kbp, a genome size outside 1.6-3.0 Mbp, a GC content outside 40.0-42.5%, or more than 50 Ns per 100 kbp was excluded. Average nucleotide identity (ANI) comparisons were made between all genomes using FastANI v1.1. Genomes that did not show at least 95% ANI to the main cluster of Streptococcus suis strains were excluded.
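The exclusion criteria above can be expressed as a small filter. The following is a minimal sketch, not the authors' pipeline: the `AssemblyStats` class, its field names, and the `passes_qc` and `passes_ani` functions are hypothetical, and the statistics are assumed to have already been read from the Quast and FastANI outputs. How a single ANI value relative to the main cluster was derived is not stated in the description, so `passes_ani` simply takes that value as an input.

```python
from dataclasses import dataclass


@dataclass
class AssemblyStats:
    """Per-assembly summary statistics (hypothetical field names)."""
    contigs: int             # number of contigs
    n50_bp: int              # N50 in base pairs
    total_length_bp: int     # genome size in base pairs
    gc_percent: float        # GC content in percent
    ns_per_100kbp: float     # ambiguous bases (N) per 100 kbp


def passes_qc(stats: AssemblyStats) -> bool:
    """Return True if the assembly meets every quality threshold in the description."""
    return (
        stats.contigs <= 500                                 # exclude > 500 contigs
        and stats.n50_bp >= 10_000                           # exclude N50 < 10 kbp
        and 1_600_000 <= stats.total_length_bp <= 3_000_000  # keep 1.6-3.0 Mbp
        and 40.0 <= stats.gc_percent <= 42.5                 # keep 40.0-42.5% GC
        and stats.ns_per_100kbp <= 50                        # exclude > 50 Ns per 100 kbp
    )


def passes_ani(ani_to_main_cluster: float, threshold: float = 95.0) -> bool:
    """Return True if ANI (in percent) to the main cluster is at least the threshold.

    Assumption: the caller supplies one ANI value per genome; the description
    does not say how that value was summarized from the pairwise comparisons.
    """
    return ani_to_main_cluster >= threshold


if __name__ == "__main__":
    # Illustrative values only, not taken from the dataset.
    example = AssemblyStats(
        contigs=120,
        n50_bp=85_000,
        total_length_bp=2_100_000,
        gc_percent=41.2,
        ns_per_100kbp=0.0,
    )
    print(passes_qc(example) and passes_ani(97.5))
```

Threshold values sitting exactly on a boundary (for example, exactly 500 contigs or exactly 95% ANI) are kept in this sketch, following the wording "more than", "lower than", "outside", and "at least" in the description.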


Metadata was extracted from linked publications and curated. Strains were placed in one of three host groups: human hosts, diseased pig hosts, or healthy pig hosts. References are available from the metadata table.


Files (1.1 GB)

Name Size
1.1 GB
180.9 kB

Additional details


European Commission: PIGSs – Program for Innovative Global Prevention of Streptococcus suis (727966)