MotifLeadDB v1
Description
MotifLeadDB is centered on a strict Core dataset containing 342,489 modeled receptor–ligand complex structures across 357 protein targets. To broaden target diversity beyond the Core set, we additionally included a diverse-only extension, yielding a total of 378,918 modeled receptor–ligand complexes across 396 targets in the full Diverse dataset.
Structural models are organized hierarchically by receptor template, ligand scaffold, and model confidence level. The three confidence levels (Levels 1–3) are defined by scaffold alignment accuracy and pharmacophore conservation.
The dataset hierarchy is as follows:
- Diverse: complete dataset containing all released entries, including the Core set and an additional diverse-only extension introduced to broaden target diversity (396 targets; 378,918 entries).
- Core: subset retaining only entries supported by mutation-free templates (357 targets; 342,489 entries).
- Core-NR: subset of Core further restricted to non-redundant ligand assignments (357 targets; 97,173 entries).
- Core-NR-Act: subset of Core-NR further restricted to scaffold groups with the same activity type and a minimum within-group pActivity range of 0.2 (357 targets; 93,995 entries). Activity-specific tables are distributed separately for pKi, pIC50, and pKd.
- HC-Core: subset of Core-NR-Act retaining only confidence Level 1 models (342 targets; 61,223 entries). High-confidence activity-specific tables are likewise distributed separately for pKi, pIC50, and pKd.
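The counts in the hierarchy above can be checked with a few lines of arithmetic. The sketch below only uses the numbers quoted in the list; the names `SUBSETS`, `check_nesting`, and `diverse_only_size` are illustrative and not part of the dataset. It assumes that Diverse is exactly Core plus the diverse-only extension, so the extension size is the difference between the two.

```python
# Subset sizes copied from the hierarchy above: name -> (targets, entries).
# Order runs from the full dataset down to the most restrictive subset.
SUBSETS = {
    "Diverse": (396, 378_918),
    "Core": (357, 342_489),
    "Core-NR": (357, 97_173),
    "Core-NR-Act": (357, 93_995),
    "HC-Core": (342, 61_223),
}


def check_nesting(subsets):
    """Return True if targets and entries never increase from parent to child."""
    sizes = list(subsets.values())
    return all(
        child[0] <= parent[0] and child[1] <= parent[1]
        for parent, child in zip(sizes, sizes[1:])
    )


def diverse_only_size(subsets):
    """Targets and entries added by the diverse-only extension (Diverse minus Core)."""
    d_targets, d_entries = subsets["Diverse"]
    c_targets, c_entries = subsets["Core"]
    return d_targets - c_targets, d_entries - c_entries


nesting_ok = check_nesting(SUBSETS)
extension = diverse_only_size(SUBSETS)
print(nesting_ok, extension)  # True (39, 36429)
```

Under that assumption, the diverse-only extension contributes 39 additional targets and 36,429 entries.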
The dataset includes branch-level tables (core.csv, diverse_only.csv), an integrated full table (tables/diverse.csv), derived Core subset tables, branch-specific structure archives, and metadata tables summarizing model-, ligand-, and target-level annotations.
Each entry is annotated with structural quality metrics, aggregated BindingDB activity values, scaffold grouping, template quality annotations, and dataset subset labels. Template consistency was further examined using SIFTS-based pocket mapping, and ActivityDB-based annotations are also provided where available.
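To make the Core-NR-Act criterion more concrete (scaffold groups with a single activity type and a within-group pActivity range of at least 0.2), here is a minimal sketch of one possible reading of that rule. It is not the authors' pipeline: the row layout `(group_id, activity_type, p_activity)`, the function name `select_groups`, and the choice to drop mixed-type groups entirely are all assumptions made for illustration.

```python
from collections import defaultdict

MIN_RANGE = 0.2  # threshold quoted in the Core-NR-Act definition


def select_groups(rows, min_range=MIN_RANGE):
    """Keep scaffold groups with one activity type and a sufficient pActivity spread.

    rows: iterable of (group_id, activity_type, p_activity) tuples.
    This layout is hypothetical; the released CSV schema is described in README.md.
    """
    by_group = defaultdict(list)
    for group_id, activity_type, p_activity in rows:
        by_group[group_id].append((activity_type, p_activity))

    kept = []
    for group_id, members in by_group.items():
        types = {activity_type for activity_type, _ in members}
        values = [p_activity for _, p_activity in members]
        # Assumed rule: exactly one activity type and max - min >= min_range.
        if len(types) == 1 and max(values) - min(values) >= min_range:
            kept.append(group_id)
    return sorted(kept)


# Toy data, invented for this example only.
rows = [
    ("g1", "pKi", 6.0), ("g1", "pKi", 7.5),    # one type, range 1.5: kept
    ("g2", "pKi", 6.0), ("g2", "pIC50", 8.0),  # mixed activity types: dropped
    ("g3", "pKd", 5.0), ("g3", "pKd", 5.1),    # range about 0.1: dropped
]
selected = select_groups(rows)
print(selected)  # ['g1']
```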
Detailed descriptions of file organization, subset definitions, and CSV column schema are provided in the accompanying README.md.
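Because the column schema is documented only in README.md, it can help to inspect a table's header before relying on any column name. The following is a small sketch using the Python standard library; `read_header` is a hypothetical helper, and the path in the usage note assumes the archive has been extracted so that `tables/diverse.csv` is available locally.

```python
import csv


def read_header(csv_path):
    """Return the column names from the first row of a CSV file.

    Returns an empty list if the file has no rows.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return next(csv.reader(handle), [])
```

For example, `read_header("tables/diverse.csv")` would list the columns of the integrated table, which can then be compared against the schema in README.md.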
Files (32.2 GB)

| MD5 checksum | Size |
|---|---|
| 9ff570692470d98ec051e735b34e2b7f | 28.9 MB |
| 01ecf2171a71aa1c41ff6a794211d790 | 29.0 GB |
| 29b545fbef45517540b92ccb15f88095 | 3.2 GB |
| 55aa744d039d26c38c769820c6e12681 | 13.0 kB |
| 9167a3d8f46dc28badccd6814daa77e7 | 40.7 MB |
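Since the largest archive is 29.0 GB, it is worth verifying each download against its listed MD5 checksum. The sketch below uses Python's `hashlib` and reads the file in chunks so that large archives do not need to fit in memory. The helper names `md5_of_file` and `matches_listing` are illustrative, and any local file path passed to them is up to the user.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_checksum):
    """Compare a file against a checksum given as 'md5:<hex>' or plain '<hex>'."""
    expected = listed_checksum.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

A call such as `matches_listing(local_path, "md5:55aa744d039d26c38c769820c6e12681")` returns `True` only if the downloaded file is byte-identical to the 13.0 kB file in the listing; `local_path` stands for wherever that file was saved.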