PathoFact 2.0 Datasets
Authors/Creators
Description
The PathoFact2_datasets.tar.gz archive contains datasets used for training, validation, and benchmarking of PathoFact 2.0. The dataset is organised into the following folders:
1. BenchMarking/
   Contains test datasets used to benchmark the performance of PathoFact 2.0 against other prediction tools.
   - Toxin_module/
     - 1000_TEST_non-toxin_ToxinPred2BenchMarking.faa
       Non-toxin test dataset used when benchmarking PathoFact 2.0 against ToxinPred2.
     - 1000_TEST_toxin_ToxinPred2BenchMarking.faa
       Toxin test dataset used when benchmarking PathoFact 2.0 against ToxinPred2.
   - VF_module/
     - VF_Test_dataset_VirulentHunterBenchmarking.faa
       Virulence factor test dataset used when benchmarking PathoFact 2.0 against VirulentHunter.
     - non-VF_Test_dataset_VirulentHunterBenchmarking.faa
       Non-virulence-factor test dataset used for benchmarking PathoFact 2.0 against VirulentHunter.
2. Toxin_module/
   Contains datasets used for training and validating the PathoFact 2.0 Toxin prediction module.
   - Toxin-related.faa
   - non-toxin.faa
   - splits/ (datasets divided into 80% training and 20% test sets):
     - TRAIN_Positive_TOX.faa
     - TRAIN_Negative_TOX.faa
     - Test_Positive_TOX.faa
     - Test_Negative_TOX.faa
3. VF_module/
   Contains datasets used for training and validating the PathoFact 2.0 Virulence Factor prediction module.
   - VF_dataset.faa
   - non-VF.faa
   - splits/ (datasets divided into 80% training and 20% test sets):
     - TRAIN_Positive_VF.faa
     - TRAIN_Negative_VF.faa
     - Test_Positive_VF.faa
     - Test_Negative_VF.faa
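The 80%/20% division described for the splits/ folders can be checked after extraction by counting FASTA records. The following is a minimal sketch, not part of the PathoFact 2.0 code: the function names are hypothetical, and the extraction directory `PathoFact2_datasets` is an assumption about where the archive was unpacked. It relies only on the convention that every FASTA record starts with a `>` header line.

```python
from pathlib import Path


def count_fasta_records(path):
    """Count sequences in a FASTA file by counting header lines that start with '>'."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if line.startswith(">"))


def train_fraction(train_path, test_path):
    """Return the fraction of all records (train + test) that sit in the training file."""
    n_train = count_fasta_records(train_path)
    n_test = count_fasta_records(test_path)
    total = n_train + n_test
    if total == 0:
        raise ValueError("both FASTA files are empty")
    return n_train / total


if __name__ == "__main__":
    # Assumption: the archive was extracted into ./PathoFact2_datasets
    root = Path("PathoFact2_datasets")
    for module, tag in (("Toxin_module", "TOX"), ("VF_module", "VF")):
        splits = root / module / "splits"
        for label in ("Positive", "Negative"):
            train = splits / f"TRAIN_{label}_{tag}.faa"
            test = splits / f"Test_{label}_{tag}.faa"
            if train.exists() and test.exists():
                print(module, label, round(train_fraction(train, test), 3))
            else:
                print(module, label, "files not found under", splits)
```

A value close to 0.8 for each positive and negative pair would be consistent with the stated split; small deviations are expected from rounding when the total is not a multiple of five.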
DOME-ML_PathoFact2.json is the JSON file created for PathoFact 2.0 following DOME, the community standard for transparent machine learning.
Files
(220.2 MB)
| Name | Size | MD5 |
|---|---|---|
| DOME-ML_PathoFact2.json | 10.0 kB | e97f325d96c1c42742605b81b528f9b8 |
| PathoFact2_datasets.tar.gz | 220.2 MB | cf5ad7e8a9c89d58f507cb5c32e46024 |
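The MD5 digests listed above can be used to confirm that a download is intact. The sketch below is one way to do this with Python's standard `hashlib` module. The pairing of each digest with a file name is an assumption inferred from the listed sizes (10.0 kB for the JSON file, 220.2 MB for the archive), and the helper names are hypothetical.

```python
import hashlib
from pathlib import Path

# Digests as listed for this record. The file-name pairing is an assumption
# inferred from the file sizes, not stated explicitly in the listing.
EXPECTED_MD5 = {
    "DOME-ML_PathoFact2.json": "e97f325d96c1c42742605b81b528f9b8",
    "PathoFact2_datasets.tar.gz": "cf5ad7e8a9c89d58f507cb5c32e46024",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected):
    """Return True if the file's MD5 digest matches the expected hex string."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumption: both files were downloaded into the current directory
    for name, expected in EXPECTED_MD5.items():
        path = Path(name)
        if not path.exists():
            print(name, "not found in current directory")
        else:
            print(name, "OK" if verify(path, expected) else "MISMATCH")
```

MD5 is adequate here for detecting accidental corruption during download; it is not a defence against deliberate tampering.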
Additional details
Dates
- Created: 2025-11-12