Dataset Open Access
PNMRNP is an SDF file that reports the structure, properties and classification of 211,280 natural products.
The starting point of this work (January 2019) was the set of CSV files of the UNPD database (Gu J et al., PLOS ONE 2013, 8, e62839, doi:10.1371/journal.pone.0062839) that are packaged with the ISDB mass spectrometry fragmentation database (Allard PM et al., Anal. Chem. 2016, 88, 6, 3317-3323, doi:10.1021/acs.analchem.5b04804).
Starting from the InChI strings of the compounds, 2D structures with configuration data were produced, mainly with the RDKit cheminformatics toolkit. PubChem identifiers were searched for the compounds and used as keys to assign names and synonyms to the molecules. Carbon atoms in the molecular structures were associated with 13C NMR chemical shift values using nmrshiftdb2. A three-level classification of compounds according to the presence of substructures is proposed. For example, a compound may be classified as a terpene (level 1), as a sesquiterpene (level 2), and as a eudesmane (level 3).
Version 2 of PNMRNP includes the classification of organic compounds according to ClassyFire.
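As a rough illustration of how the per-compound data fields of an SDF file such as PNMRNP might be read and tallied by classification level, the following is a minimal sketch in plain Python. It relies only on the general SDF layout (records terminated by `$$$$`, data items introduced by `> <TAG>` lines). The tag names `CLASS_LEVEL_1` to `CLASS_LEVEL_3` and the two sample records are hypothetical and are not taken from PNMRNP; the actual tag names should be checked in the file itself.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List, Optional


def iter_sdf_records(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict of data fields per SDF record.

    Simplified reader: only data items ("> <TAG>" header followed by
    value lines) are collected; the connection table is ignored.
    """
    fields: Dict[str, str] = {}
    tag: Optional[str] = None      # data field currently being read
    value: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.strip() == "$$$$":                 # end of record
            if tag is not None:
                fields[tag] = "\n".join(value)
            yield fields
            fields, tag, value = {}, None, []
        elif line.startswith(">") and "<" in line and ">" in line[1:]:
            if tag is not None:                    # close previous field
                fields[tag] = "\n".join(value)
            tag = line[line.index("<") + 1 : line.rindex(">")]
            value = []
        elif tag is not None:
            if line == "":                         # blank line ends a value
                fields[tag] = "\n".join(value)
                tag, value = None, []
            else:
                value.append(line)


# Two made-up records with HYPOTHETICAL tag names (not PNMRNP's own).
SAMPLE = """\
compound-1
  hypothetical

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <CLASS_LEVEL_1>
terpene

> <CLASS_LEVEL_2>
sesquiterpene

> <CLASS_LEVEL_3>
eudesmane

$$$$
compound-2
  hypothetical

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <CLASS_LEVEL_1>
terpene

> <CLASS_LEVEL_2>
diterpene

$$$$
"""

records = list(iter_sdf_records(SAMPLE.splitlines()))

# Count compounds per level-1 class; records lacking the tag are grouped.
level1_counts = Counter(
    r.get("CLASS_LEVEL_1", "unclassified") for r in records
)
```

For the real file, an open file handle can be passed to `iter_sdf_records` in place of `SAMPLE.splitlines()`, so that the records are streamed rather than loaded at once. When the structures themselves are needed, a cheminformatics toolkit such as RDKit is likely the better choice than this hand-written reader.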