Published September 15, 2025 | Version v1
Dataset Open

CAGE-Fusion Nuisance Compound Detection Dataset

  • 1. Texas A&M University
  • 2. Panorama Global
  • 3. Bill & Melinda Gates Foundation

Description

This dataset contains curated molecular structures and associated nuisance and assay-interference annotations used to train and evaluate the CAGE-Fusion multimodal deep-learning framework for early drug-discovery screening. The dataset aggregates compounds from multiple public sources and literature-derived annotations, covering common assay-interference categories such as aggregation, reactivity, luciferase inhibition, and promiscuous behavior.

Files

test.csv

Files (23.6 MB)

Checksum Size
md5:a32f5f848e848d89f8c8d2ec7294e88b 2.3 MB
md5:086842d490f17fdfaa2fca42110ce4ad 18.9 MB
md5:3a229dfd14c79aeda32147c80437394b 2.4 MB
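The MD5 checksums listed above can be used to confirm that a downloaded file arrived intact. The following is a minimal sketch, not part of the dataset itself: the helper names `md5_of_file` and `matches_listed_checksum` are hypothetical, and because the listing does not show which checksum belongs to which file, the sketch only checks whether a file's digest matches any of the three listed values.

```python
import hashlib

# Checksums copied from the file listing on this record. The listing does not
# say which file each one belongs to, so they are kept as an unlabeled set.
LISTED_MD5 = {
    "a32f5f848e848d89f8c8d2ec7294e88b",
    "086842d490f17fdfaa2fca42110ce4ad",
    "3a229dfd14c79aeda32147c80437394b",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed=LISTED_MD5):
    """Return True if the file's MD5 digest is one of the listed checksums."""
    return md5_of_file(path) in listed
```

For example, after downloading `test.csv` into the working directory, `matches_listed_checksum("test.csv")` should return `True` if the file is complete and unmodified. Reading in chunks keeps memory use flat for the largest file (18.9 MB here, but the same code works for much larger ones).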

Additional details