Published April 13, 2026 | Version v11
Dataset Open

The LOTUS Initiative for Open Natural Products Research: frozen dataset union wikidata (with metadata)

  • 1. Institute for Molecular Systems Biology
  • 2. Collaborative Drug Discovery (United States)
  • 3. University of Fribourg
  • 1. Friedrich-Schiller-University Jena
  • 2. Institute of Organic Chemistry and Biochemistry of the CAS
  • 3. University of Virginia
  • 4. Maastricht University
  • 5. Université de Genève
  • 6. University of Illinois at Chicago
  • 7. Ontario Institute for Cancer Research
  • 8. University of Glasgow

Description

Tabular CSV exports of LOTUS Initiative (https://doi.org/10.7554/eLife.70780) data retrieved from Wikidata (https://www.wikidata.org).

File | Description
{date_str}_frozen.csv.gz | Core triplet table: each row is a unique (structure InChIKey, organism Wikidata QID, reference Wikidata QID) triple with organism name, reference DOI, and manual validation status.
{date_str}_frozen_metadata.csv.gz | Comprehensive metadata table: enriched with structural descriptors (InChI, SMILES, molecular formula, exact mass, stereocenters), chemical classifications (NPClassifier, ClassyFire), biological taxonomy (Open Tree of Life), PubChem compound properties, and literature references (DOI, PMID, PMCID).
{date_str}_changes_report.txt | Change report: summary of additions and removals compared to the previous version.
lotus_exporter.py | Generator script (marimo notebook): reproduces all output files from live Wikidata. Run with: uv run lotus_exporter.py export -o ./output -v
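
Because the core table is described as holding one row per unique (structure, organism, reference) triple, a quick uniqueness check after download can be useful. The following is a minimal sketch using only the Python standard library. The column names passed in are assumptions for illustration and are not taken from this record; inspect the header of the downloaded file and substitute the real names.

```python
import csv
import gzip
from collections import Counter


def count_rows_per_key(path, key_columns):
    """Count how many rows share each key tuple in a gzip-compressed CSV.

    key_columns is a list of header names; the caller supplies them,
    since the actual column names are not documented on this page.
    """
    counts = Counter()
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        reader = csv.DictReader(handle)
        header = reader.fieldnames or []
        missing = [name for name in key_columns if name not in header]
        if missing:
            raise KeyError(f"columns not found in header: {missing}")
        for row in reader:
            counts[tuple(row[name] for name in key_columns)] += 1
    return counts


def duplicated_keys(counts):
    """Return the key tuples that occur on more than one row."""
    return [key for key, n in counts.items() if n > 1]
```

A call might look like `duplicated_keys(count_rows_per_key("260413_frozen.csv.gz", ["structure_inchikey", "organism_wikidata", "reference_wikidata"]))`, where both the file name and the three column names are hypothetical. An empty list would be consistent with the one-row-per-triple description.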

Files

260413_changes_report.txt

Files (111.0 MB)

Size | Checksum
374 Bytes | md5:a166d5e34250254389f67605511005a3
20.6 MB | md5:cf0cf2afa2ca4d758b68f2e39d466f5d
90.3 MB | md5:b17048b3b77daae9ab1e480b6591aabd
106.8 kB | md5:af9209282f5e04586f22ed308cce66d7
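
The md5 values listed for the files can be used to confirm that a download completed intact. Below is a small sketch of such a check with Python's `hashlib`. The helper names are illustrative and not part of this record, and the pairing of a local file with its listed checksum is left to the reader.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a local file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listing(local_path, "md5:a166d5e34250254389f67605511005a3")` should return `True` only if `local_path` is an intact copy of the file shown with that checksum on the record page.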

Additional details

Related works

Is described by
Journal article: 10.7554/eLife.70780 (DOI)
Is new version of
Dataset: 10.5281/zenodo.7534071 (DOI)

Funding

Swiss National Science Foundation
MetaboLinkAI 10002786