Published September 13, 2021 | Version v1
Dataset
Open
Supporting data for the manuscript "Nerpa: a tool for discovering biosynthetic gene clusters of nonribosomal peptides"
Creators
- 1. St. Petersburg State University, Sirius University of Science and Technology
- 2. St. Petersburg State University, Sirius University of Science and Technology, St. Petersburg Electrotechnical University "LETI"
- 3. University of California San Diego
- 4. Carnegie Mellon University
Description
Preprocessed structures of nonribosomal peptides (NRPs) and genomic sequences (reference and representative genomes, biosynthetic gene clusters [BGCs]) used in the benchmark experiments of the Nerpa paper.
Files description
- mibig_nrp_bacteria_preprocessed.tar.gz contains the preprocessed dataset of 194 bacterial NRP BGCs from the MIBiG database.
- mibig_nrp_bacteria_summary.tsv contains metadata for the MIBiG-NRP dataset.
- bacterial_ref_and_repr_genomes_20210604_preprocessed.tar.gz contains the preprocessed dataset of 13,399 reference and representative bacterial genomes from the NCBI RefSeq database (retrieved on 2021/06/04).
- bacterial_ref_and_repr_genomes_20210604_summary.txt contains metadata for the RefSeq dataset.
- pnrpdb_preprocessed.info contains the Nerpa-preprocessed pNRPdb, a database of 8,368 known and putative NRP structures.
- pnrpdb_summary.tsv contains metadata for the pNRPdb database.
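A quick way to inspect the downloaded files before running anything on them is to list the first members of an archive and peek at the first rows of a summary table. The sketch below uses only the Python standard library. The helper names `list_archive` and `peek_tsv` are hypothetical, and the assumption that the `.tsv` summaries start with a header row is not confirmed by this record.

```python
import csv
import tarfile


def list_archive(path, limit=10):
    """Return the names of the first `limit` members of a .tar.gz archive."""
    names = []
    with tarfile.open(path, "r:gz") as archive:
        for member in archive:
            names.append(member.name)
            if len(names) >= limit:
                break
    return names


def peek_tsv(path, n_rows=5):
    """Return the first row (assumed to be a header) and up to `n_rows` following rows."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader, [])
        rows = [row for _, row in zip(range(n_rows), reader)]
    return header, rows
```

For example, `list_archive("mibig_nrp_bacteria_preprocessed.tar.gz")` would show how that archive is laid out, and `peek_tsv("mibig_nrp_bacteria_summary.tsv")` would show which columns the metadata table actually contains.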
Files (62.0 MB)
MD5 | Size |
---|---|
9461f60dda8308ee93aa027b214882b2 | 53.9 MB |
63d40bac7d9670880299266e67945bd4 | 4.7 MB |
ceca8f21627352076bdc7ec36b0f8769 | 305.0 kB |
94bc9f4dfea1bf0d285391de5dd42a89 | 39.0 kB |
452a3ccafe5d5517467604cbfba701da | 879.5 kB |
e1ff8badab5167d59129580efe5300e3 | 2.1 MB |
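The MD5 checksums listed for the files can be used to confirm that a download completed without corruption. The following is a minimal sketch using Python's standard `hashlib`. The helper names `md5_of` and `is_published` are hypothetical. Because the listing here does not pair each checksum with a file name, the check only tests whether a file's digest appears among the published values.

```python
import hashlib

# MD5 digests as listed for the files in this record.
PUBLISHED_MD5 = {
    "9461f60dda8308ee93aa027b214882b2",
    "63d40bac7d9670880299266e67945bd4",
    "ceca8f21627352076bdc7ec36b0f8769",
    "94bc9f4dfea1bf0d285391de5dd42a89",
    "452a3ccafe5d5517467604cbfba701da",
    "e1ff8badab5167d59129580efe5300e3",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_published(path, published=PUBLISHED_MD5):
    """Return True if the file's MD5 digest matches one of the published digests."""
    return md5_of(path) in published
```

Calling `is_published("pnrpdb_summary.tsv")` on a downloaded copy should return `True` for an intact file. A `False` result suggests a truncated or altered download.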