# benchmarking_datasets

The corresponding metadata for the final isolates can be found in [data/strain.metadata.txt](data/strain.metadata.txt).

## References:
* [Genomic diversity affects the accuracy of bacterial single-nucleotide polymorphism–calling pipelines](http://doi.org/10.1093/gigascience/giaa007)
* [A comparison of tools for the simulation of genomic next-generation sequencing data](https://doi.org/10.1038/nrg.2016.57)
* [Benchmarking bacterial genome-wide association study methods using simulated genomes and phenotypes](https://www.microbiologyresearch.org/content/journal/mgen/10.1099/mgen.0.000337)
* [A roadmap for the generation of benchmarking resources for antimicrobial resistance detection using next generation sequencing](https://doi.org/10.12688/f1000research.39214.1)
* [Benchmark datasets for phylogenomic pipeline validation, applications for foodborne pathogen surveillance](https://doi.org/10.7717/peerj.3893)
* [MOB-suite: software tools for clustering, reconstruction and typing of plasmids from draft assemblies](https://dx.doi.org/10.1099%2Fmgen.0.000206) (see [Supplementary File 2](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6159552/bin/mgen-5-206-s002.xlsx))

## Links:

* Simulation code for metagenomics, including random copy number and insertion of AMR genes (either as a slightly painful SeqAn-based C++ implementation or as a simple Nextflow workflow): https://github.com/fmaguire/metagenome_simulator
* https://github.com/CAMI-challenge/CAMISIM: more of a proper taxonomic simulator (i.e., it can use 16S taxonomic profiles to simulate the metagenome), but it cannot insert or label AMR genes.
* FDA-ARGOS BioProject: https://www.ncbi.nlm.nih.gov/bioproject/231221
* https://github.com/metagenlab/MeSS: workflow to simulate metagenomic datasets from published genomes 
* https://www.jpiamr.eu/projects/seq4amr/ 
* https://academic.oup.com/gigascience/pages/data_note
* https://github.com/phac-nml/staramr/blob/master/staramr/databases/resistance/data/ARG_drug_key_resfinder.tsv
* https://github.com/tseemann/injecta: insert genes into genomes to aid synthetic test data generation
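
Several of the tools above insert AMR genes into genomes so that the resulting test data carries known labels. The general idea can be sketched in a few lines of Python. This is a minimal illustration and is not taken from any of the linked tools: the function name `insert_gene`, its parameters, and the toy sequences are all hypothetical.

```python
import random


def insert_gene(genome: str, gene: str, max_copies: int = 3, seed: int = 0):
    """Insert 1..max_copies copies of `gene` at random positions in `genome`.

    Returns the modified sequence and a list of (start, end) labels
    (0-based, end-exclusive) locating each inserted copy in the result.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    n_copies = rng.randint(1, max_copies)  # random copy number
    # Insertion points in original genome coordinates, left to right.
    sites = sorted(rng.randint(0, len(genome)) for _ in range(n_copies))

    pieces, labels = [], []
    prev, offset = 0, 0
    for site in sites:
        pieces.append(genome[prev:site])
        start = site + offset  # shift by the bases inserted so far
        pieces.append(gene)
        labels.append((start, start + len(gene)))
        offset += len(gene)
        prev = site
    pieces.append(genome[prev:])
    return "".join(pieces), labels


# Toy sequences, not real genome or AMR gene data.
genome = "ACGT" * 25
gene = "TTTGGGCCC"
simulated, labels = insert_gene(genome, gene, max_copies=3, seed=1)
print(len(simulated), labels)
```

Recording the labels in the coordinates of the simulated sequence means a downstream benchmark can check each predicted gene location directly against the truth set.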
