Published January 11, 2021 | Version 0.1.0
Dataset Open

PubChemLite Uploads

  • 1. NIH/NLM/NCBI
  • 2. LCSB, Uni Luxembourg

Description

Official Repository of PubChemLite Exposomics Datasets

PubChemLite is a subset of PubChem (https://pubchem.ncbi.nlm.nih.gov/) selected from major categories of the Table of Contents page at the PubChem Classification Browser (https://pubchem.ncbi.nlm.nih.gov/classification/#hid=72).

PubChem CIDs have been collapsed by InChIKey first block, reporting the structure from the most annotated CID, plus related CIDs. Entries that would be ignored by MetFrag (salts, disconnected substances) or cause errors (e.g. transition metals) have been removed. The Patent and PubMed ID counts are extracted from files on the PubChem FTP site. The "AnnoTypeCount" term counts how many of the categories are represented; the subsequent columns (named per category) count the number of annotation categories available in the next sub-category of the TOC entry.
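The collapsing step described above could be sketched roughly as follows. This is only an illustration, not the pipeline actually used to build PubChemLite: the record keys ("cid", "inchikey", "annotation_count") and the function name are hypothetical, and ties between equally annotated CIDs are simply resolved in favour of the first one encountered.

```python
from collections import defaultdict


def collapse_by_first_block(records):
    """Group records by InChIKey first block; keep the most annotated CID.

    `records` is a list of dicts with the hypothetical keys
    "cid", "inchikey" and "annotation_count".
    """
    groups = defaultdict(list)
    for rec in records:
        # The first block is the part of the InChIKey before the first hyphen.
        first_block = rec["inchikey"].split("-")[0]
        groups[first_block].append(rec)

    collapsed = []
    for first_block, members in groups.items():
        # max() returns the first maximal element, so ties keep input order.
        best = max(members, key=lambda r: r["annotation_count"])
        related = [r["cid"] for r in members if r is not best]
        collapsed.append(
            {
                "first_block": first_block,
                "cid": best["cid"],
                "inchikey": best["inchikey"],
                "related_cids": related,
            }
        )
    return collapsed
```

Each returned entry carries the structure identifier of the most annotated CID together with the list of related CIDs that shared the same first block.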

PubChemLite exposomics is compiled from 10 categories: AgroChemInfo, BioPathway, DrugMedicInfo, FoodRelated, PharmacoInfo, SafetyInfo, ToxicityInfo, KnownUse, DisorderDisease, Identification.
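As a hedged illustration of how a value like "AnnoTypeCount" relates to the 10 categories, the sketch below counts how many category fields in a row hold a positive number. It assumes that the per-category columns are named exactly like the categories listed above and that a missing, empty, or zero value means the category is not represented; neither assumption is confirmed by this description, so check the CSV header before relying on it.

```python
# Category names as listed in the dataset description.
CATEGORIES = [
    "AgroChemInfo", "BioPathway", "DrugMedicInfo", "FoodRelated",
    "PharmacoInfo", "SafetyInfo", "ToxicityInfo", "KnownUse",
    "DisorderDisease", "Identification",
]


def anno_type_count(row):
    """Count the categories with a positive value in `row` (a dict).

    Assumption: column names equal the category names, and values are
    numeric counts (possibly stored as strings, as csv.DictReader yields).
    """
    represented = 0
    for category in CATEGORIES:
        try:
            value = float(row.get(category, 0))
        except (TypeError, ValueError):
            # Empty strings, None, or non-numeric text count as "absent".
            value = 0.0
        if value > 0:
            represented += 1
    return represented
```

Applied to rows read with `csv.DictReader`, this can serve as a quick consistency check against the "AnnoTypeCount" column, provided the column-naming assumption holds.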

These files can be used "as is" as a localCSV for MetFrag Command Line (https://ipb-halle.github.io/MetFrag/). Please do NOT upload these files directly to the web interface: they are too large, and will instead be available there in a drop-down menu.

Files

PubChemLite_01Jan2021_exposomics.csv (194.3 MB)
md5:075ee1b7f7b00fef8af5a59edc539479
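After downloading, the file can be checked against the md5 value listed above. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and the local path in the usage note is an assumption about where the file was saved.

```python
import hashlib

# Checksum as published for PubChemLite_01Jan2021_exposomics.csv.
EXPECTED_MD5 = "075ee1b7f7b00fef8af5a59edc539479"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex md5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """True if the file's md5 digest equals the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_expected("PubChemLite_01Jan2021_exposomics.csv")` should return True for an intact download saved in the current directory.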