A curated database of fungal pathogens and their host range
- 1. Hasso Plattner Institute
Description
This database contains a manually curated set of fungal pathogens of humans, animals and plants, annotated with their confirmed host range and relevant sources. We also include further sets of plant-associated fungi (which may include non-pathogens), as well as fungi with an automatically assigned, putative human, animal or plant host. The labelled fungal species are linked to their representative GenBank genomes wherever possible. Genomes that were screened but for which no label was found are also included.
The database is stored in a flat-file format. All metadata are stored in all_data.csv; all_data.rds contains the same data in a compressed format that can be easily loaded in R. The core database is limited to manually confirmed human, animal and plant pathogens with genomes available as of 9 Oct 2021. These records are a subset of all_data and are stored in core_fungal_pathogens.csv and core_fungal_pathogens.rds.
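For users who do not work in R, the CSV files can be read with any standard CSV parser. The following is a minimal sketch in Python using only the standard library. It assumes the files are comma-separated, UTF-8 encoded and start with a header row, none of which is stated above. The helper names load_table and is_row_subset are hypothetical, and the subset check assumes that core records are stored verbatim in all_data for the columns the two files share, which may not hold if values are formatted differently.

```python
import csv
from pathlib import Path


def load_table(path):
    """Read a CSV file and return (column_names, rows); each row is a dict."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        # fieldnames is None for an empty file, so fall back to an empty list
        return list(reader.fieldnames or []), rows


def is_row_subset(core_rows, all_rows, columns):
    """Return True if every core row also occurs in all_rows on `columns`."""
    seen = {tuple(row[c] for c in columns) for row in all_rows}
    return all(tuple(row[c] for c in columns) in seen for row in core_rows)


if __name__ == "__main__":
    # File names as given in the description; adjust the paths as needed.
    all_path = Path("all_data.csv")
    core_path = Path("core_fungal_pathogens.csv")
    if all_path.exists() and core_path.exists():
        all_cols, all_rows = load_table(all_path)
        core_cols, core_rows = load_table(core_path)
        print("columns in all_data.csv:", all_cols)
        print("records:", len(all_rows), "total,", len(core_rows), "core")
        # Compare only the columns present in both files (an assumption).
        shared = [c for c in core_cols if c in all_cols]
        if shared:
            print("core is a subset:", is_row_subset(core_rows, all_rows, shared))
```

The sketch prints the column names rather than assuming them, since the column layout is not described here. All values are read as strings; convert them as needed for analysis.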
You may also be interested in trained neural network models predicting pathogenic potentials of novel fungi from DNA sequences (https://zenodo.org/record/5711877) and simulated Illumina read sets used to train them (https://zenodo.org/record/5713153).
See also the preprint: https://www.biorxiv.org/content/10.1101/2021.11.30.470625v1