Published August 19, 2020 | Version v1.0.1
Dataset Open

NUclear Receptor Activity (NURA) dataset

  • 1. University of Milano-Bicocca
  • 2. ETH Zurich, University of Milano-Bicocca

Description

Dataset information

The NURA (NUclear Receptor Activity) dataset collects curated information on small molecules that modulate nuclear receptors (NRs) and is intended for both pharmacological and toxicological applications. NURA contains bioactivity annotations for 15,247 molecules and 11 selected NRs, obtained by integrating and curating data from toxicological and pharmacological databases (Tox21, ChEMBL, NR-DBIND and BindingDB). The dataset helps bridge the gap between toxicology- and medicinal-chemistry-related databases: compared with each individual source, it covers more molecules, greater structural diversity and more atomic scaffolds. To the best of our knowledge, NURA is the most exhaustive collection of small molecules annotated for their modulation of the chosen nuclear receptors. It is intended to support decision-making in pharmacology and toxicology and to contribute to data-driven applications such as machine learning.

Content

Three files are provided:

  1. "Nura_v1.0.0.csv" [datafile], containing the activity labels for each molecule (rows, identified by a unique ID and the canonical SMILES string) and each nuclear receptor endpoint (columns).
  2. "Nura_v1.0.0_details" [datafile], containing information on the individual records used to generate the dataset.
  3. "curation_pipeline.zip" [software], containing the data curation pipeline in KNIME ("NURA_Dataset.knwf") as well as a help file ("help.pdf").

Additional details on the content and curation pipeline can be found in the uploaded preprint ("NURApreprint.pdf"), which has not been peer-reviewed.
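
As an illustration of the layout described above (one row per molecule, one column per nuclear receptor endpoint), the following minimal sketch counts the labels found in each endpoint column. It runs on a small synthetic table: the column names ("ID", "SMILES", "NR1_endpoint", "NR2_endpoint"), the label values and the comma delimiter are all placeholders assumed for the example, not the actual header or vocabulary of the NURA file, which should be inspected before adapting the code.

```python
import csv
import io
from collections import Counter

# Synthetic stand-in for the activity file. Column names and label values
# are placeholders, NOT the real NURA header or label vocabulary.
SAMPLE_CSV = """ID,SMILES,NR1_endpoint,NR2_endpoint
mol_1,CCO,active,inactive
mol_2,c1ccccc1,inactive,
mol_3,CC(=O)O,active,active
"""

def label_counts(handle, id_columns=("ID", "SMILES")):
    """Count label values per endpoint column, skipping empty cells."""
    reader = csv.DictReader(handle)
    # Every column that is not an identifier is treated as an endpoint.
    endpoints = [c for c in reader.fieldnames if c not in id_columns]
    counts = {name: Counter() for name in endpoints}
    for row in reader:
        for name in endpoints:
            value = (row[name] or "").strip()
            if value:  # an empty cell means no annotation in this sketch
                counts[name][value] += 1
    return counts

counts = label_counts(io.StringIO(SAMPLE_CSV))
for endpoint, counter in counts.items():
    print(endpoint, dict(counter))
```

To try the same function on the downloaded file, the in-memory sample can be replaced with a file handle opened with `open(path, newline="")`, after adjusting `id_columns` to the identifier columns actually present in the header.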

Version information

  • v1.0.1: data curation pipeline uploaded.
  • v1.0.0: initial upload.

Contact

If you have any questions, please contact us (Francesca Grisoni, francesca.grisoni@pharma.ethz.ch; Davide Ballabio, davide.ballabio@unimib.it).

Files

Files (515.1 MB)

md5:24d62e93663564f0ea7679de02e33362 (482.2 MB)
md5:7516743668f598d73927127660074fe5 (3.8 MB)
md5:44ed28db9e43a6973b7a06438f7c3fc2 (27.7 MB)
md5:646bfbcb2421e0b446548f6cb13a8aff (1.5 MB)
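
The MD5 checksums listed above can be used to confirm that a download is intact. The sketch below shows one way to do this with Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_listed_md5` are hypothetical, and because the listing does not pair checksums with file names, the path in the usage comment is only a placeholder.

```python
import hashlib

def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives are not loaded in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_listed_md5(path, listed):
    """Compare a file's digest with a listed value such as 'md5:<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected

# Usage (placeholder path; pick the checksum shown next to the file you downloaded):
# matches_listed_md5("path/to/downloaded_file", "md5:<hex digest from the listing>")
```

A mismatch usually indicates an incomplete or corrupted download, in which case the file should be fetched again.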

Additional details

References

  • Valsecchi, Cecile, Francesca Grisoni, Stefano Motta, Laura Bonati, and Davide Ballabio. "NURA: A curated dataset of nuclear receptor modulators." Toxicology and Applied Pharmacology (2020): 115244.