Published August 30, 2021 | Version 6
Dataset Open

COCONUT: the COlleCtion of Open NatUral producTs.

  • 1. Friedrich-Schiller University Jena

Description

COCONUT is a COlleCtion of Open NatUral producTs.

The database is now available at coconut.naturalproducts.net, where updates appear before they are published here.

To assemble COCONUT, data from 55 open access collections and databases of natural products was retrieved and curated.

This archive contains three files:

  • The MongoDB dump, the most complete version of the dataset, with extensive molecular annotations
  • The COCONUT4MetFrag file, used for MetFrag. The latest version of COCONUT4MetFrag is in the file "COCONUT4MetFrag_april.csv"
  • The COCONUT.sdf file containing all unique NP molecules with selected metadata
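
The metadata in an SDF file such as COCONUT.sdf can be read without a cheminformatics toolkit, because SDF records end with a `$$$$` line and store metadata as `> <TAG>` data items followed by their values. The following is a minimal sketch in Python under that assumption. The function name `iter_sdf_records`, the tag names, and the sample values are illustrative only; the tags actually present in COCONUT.sdf should be checked against the file itself. The structure block of each record is skipped, so a toolkit is still needed to work with the molecules themselves.

```python
import io
from typing import Dict, Iterator, Optional, TextIO


def iter_sdf_records(handle: TextIO) -> Iterator[Dict[str, str]]:
    """Yield one dict of data items per SDF record; the molblock is ignored."""
    fields: Dict[str, str] = {}
    tag: Optional[str] = None
    for raw in handle:
        line = raw.rstrip("\r\n")
        if line.strip() == "$$$$":
            # End of record: emit what was collected and start over.
            yield fields
            fields, tag = {}, None
        elif line.startswith(">") and "<" in line and ">" in line[1:]:
            # Data item header such as "> <coconut_id>".
            tag = line[line.index("<") + 1 : line.rindex(">")]
            fields[tag] = ""
        elif tag is not None:
            if line == "":
                # A blank line closes the current data item.
                tag = None
            else:
                fields[tag] = f"{fields[tag]}\n{line}" if fields[tag] else line


# Made-up record for illustration; not taken from COCONUT.sdf.
sample = """\
mol1
  hypothetical

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <coconut_id>
CNP0000001

> <name>
example

$$$$
"""

records = list(iter_sdf_records(io.StringIO(sample)))
```

For the real file, the same function could be called on `open("COCONUT.sdf", encoding="utf-8")`, which streams records one at a time rather than loading the whole file into memory.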

To restore the dataset in MongoDB:

unzip COCONUT_2021_03.zip
cd COCONUT_2021_03/COCONUT_2021_03/
mongorestore --db=COCONUT --noIndexRestore .

It is generally useful to avoid restoring indexes, as they can interfere with the local installation. Here are the commands to rebuild indexes:

mongo
use COCONUT


db.sourceNaturalProduct.createIndex( {source:1})
db.sourceNaturalProduct.createIndex( {simpleInchi:"hashed"})
db.sourceNaturalProduct.createIndex( {simpleInchiKey:1})
db.sourceNaturalProduct.createIndex( {originalInchiKey:1})
db.sourceNaturalProduct.createIndex( {originalSmiles:"hashed"})
db.sourceNaturalProduct.createIndex( {absoluteSmiles:"hashed"})
db.sourceNaturalProduct.createIndex( {idInSource:1})

db.uniqueNaturalProduct.createIndex( {inchi:"hashed"})
db.uniqueNaturalProduct.createIndex( {inchikey:1})
db.uniqueNaturalProduct.createIndex( {clean_smiles: "hashed"})
db.uniqueNaturalProduct.createIndex( {molecular_formula:1})
db.uniqueNaturalProduct.createIndex( {name:1})
db.uniqueNaturalProduct.createIndex( {coconut_id:1})
db.uniqueNaturalProduct.createIndex( {fragmentsWithSugar:"hashed"})
db.uniqueNaturalProduct.createIndex( {fragments:"hashed"})
db.fragment.createIndex({signature:1})
db.fragment.createIndex({signature:1, withsugar:-1})
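
The index definitions above can also be kept as data and rendered into mongo shell statements, which makes the set easier to review or to rebuild for a single collection. The following is a minimal sketch in Python that lists each index once, using only the collection and field names given above. The names `INDEXES`, `render`, and `mongo_commands` are illustrative and not part of COCONUT.

```python
from typing import Dict, List, Tuple, Union

IndexKey = Tuple[str, Union[int, str]]

# Collection -> index definitions, copied from the commands above.
INDEXES: Dict[str, List[List[IndexKey]]] = {
    "sourceNaturalProduct": [
        [("source", 1)],
        [("simpleInchi", "hashed")],
        [("simpleInchiKey", 1)],
        [("originalInchiKey", 1)],
        [("originalSmiles", "hashed")],
        [("absoluteSmiles", "hashed")],
        [("idInSource", 1)],
    ],
    "uniqueNaturalProduct": [
        [("inchi", "hashed")],
        [("inchikey", 1)],
        [("clean_smiles", "hashed")],
        [("molecular_formula", 1)],
        [("name", 1)],
        [("coconut_id", 1)],
        [("fragmentsWithSugar", "hashed")],
        [("fragments", "hashed")],
    ],
    "fragment": [
        [("signature", 1)],
        [("signature", 1), ("withsugar", -1)],
    ],
}


def render(value: Union[int, str]) -> str:
    """Quote string index types such as "hashed"; leave sort orders bare."""
    return f'"{value}"' if isinstance(value, str) else str(value)


def mongo_commands(indexes: Dict[str, List[List[IndexKey]]]) -> List[str]:
    """Render one createIndex statement per index definition."""
    commands = []
    for collection, specs in indexes.items():
        for spec in specs:
            body = ", ".join(f"{field}: {render(order)}" for field, order in spec)
            commands.append(f"db.{collection}.createIndex({{{body}}})")
    return commands


commands = mongo_commands(INDEXES)
```

Printing `"\n".join(commands)` gives statements that can be pasted into the mongo shell after `use COCONUT`.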


This version of COCONUT is a beta release and will be curated further, but it can already be used as is.

Files (8.3 GB)

Checksum                              Size
md5:e141da1cd296f8a3a6909b7e72b3b396  3.0 GB
md5:c3925f2f8da49047101967438816c9db  123.7 MB
md5:2b1ef45ef7d52921fe72f5d38588ba73  5.1 GB
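
The MD5 checksums listed above can be used to confirm that a multi-gigabyte download completed intact before it is unpacked or restored. The following is a minimal sketch in Python using only the standard library. The function names `md5_of_file` and `matches_listed_checksum` are illustrative, and the file is read in chunks so that it does not have to fit in memory.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path: str, listed: str) -> bool:
    """Compare a file against a checksum written as "md5:<hex digest>"."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

Each checksum should be paired with the file it is shown next to on the record page. A result of `False` suggests a truncated or corrupted download.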

Additional details