5172018
doi
10.5281/zenodo.5172018
oai:zenodo.org:5172018
Lorenz C. Blum
UniBe
Lars Ruddigkeit
UniBe
Ruud van Deursen
UniBe
Jean-Louis Reymond
UniBe
GDB Databases
Tobias Fink
UniBe
doi:10.1021/ci600423u
doi:10.1002/anie.200462457
doi:10.1021/ja902302h
doi:10.1021/ci300415d
info:eu-repo/semantics/openAccess
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
Chemoinformatics
Generated database
Virtual screening
In Silico
Chemical space
Druglike small molecules
<p><strong>About</strong></p>
<p>GDB-11 enumerates small organic molecules up to 11 atoms of C, N, O and F following simple chemical stability and synthetic feasibility rules.<br>
GDB-13 enumerates small organic molecules up to 13 atoms of C, N, O, S and Cl following simple chemical stability and synthetic feasibility rules. With 977 468 314 structures, GDB-13 is the largest publicly available small organic molecule database to date.</p>
<p><strong>How to cite</strong></p>
<p>To cite GDB-11, please reference:</p>
<blockquote>
<p><a href="http://dx.doi.org/10.1021/ci600423u">Virtual exploration of the chemical universe up to 11 atoms of C, N, O, F: assembly of 26.4 million structures (110.9 million stereoisomers) and analysis for new ring systems, stereochemistry, physico-chemical properties, compound classes and drug discovery</a>. Fink, T.; Reymond, J.-L. <em>J. Chem. Inf. Model.</em> <strong>2007</strong>, 47, 342-353.</p>
<p><a href="http://dx.doi.org/10.1002/anie.200462457">Virtual Exploration of the Small Molecule Chemical Universe below 160 Daltons</a>. Fink, T.; Bruggesser, H.; Reymond, J.-L. <em>Angew. Chem. Int. Ed.</em> <strong>2005</strong>, 44, 1504-1508.</p>
</blockquote>
<p>To cite GDB-13, please reference:</p>
<blockquote>
<p><a href="http://pubs.acs.org/doi/abs/10.1021/ja902302h">970 Million Druglike Small Molecules for Virtual Screening in the Chemical Universe Database GDB-13</a>. Blum, L. C.; Reymond, J.-L. <em>J. Am. Chem. Soc.</em> <strong>2009</strong>, 131, 8732-8733.</p>
</blockquote>
<p>To cite GDB-17, please reference:</p>
<blockquote>
<p><a href="http://pubs.acs.org/doi/abs/10.1021/ci300415d">Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17</a>. Ruddigkeit, L.; van Deursen, R.; Blum, L. C.; Reymond, J.-L. <em>J. Chem. Inf. Model.</em> <strong>2012</strong>, 52, 2864-2875.</p>
</blockquote>
<p><strong>Download</strong></p>
<p>You can download the databases and their subsets using the links provided. All molecules are stored as dearomatized, canonicalized SMILES and compressed as tar/gz archives (for Windows users: download <a href="http://www.7zip.org/">7-zip</a> to open the archives).</p>
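<p>The <code>.smi.gz</code> files can be read line by line without unpacking them to disk first. The following is a minimal sketch, not an official reader: the function name <code>iter_smiles</code> is hypothetical, and it assumes one molecule per line with the SMILES string in the first whitespace-separated column. The <code>.tgz</code> archives are tar files and would need to be opened with a tar reader instead.</p>

```python
import gzip
from typing import Iterator


def iter_smiles(path: str) -> Iterator[str]:
    """Yield one SMILES string per non-empty line of a gzip-compressed .smi file."""
    # "rt" decompresses on the fly and decodes to text, so the file is
    # streamed rather than loaded into memory in one piece.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            fields = line.split()
            if fields:  # skip blank lines
                # Assumption: the SMILES string is the first column; any
                # further columns (for example annotations) are ignored.
                yield fields[0]


if __name__ == "__main__":
    # Count the molecules in a downloaded file (file name taken from the list below).
    total = sum(1 for _ in iter_smiles("GDB17.50000000.smi.gz"))
    print(total)
```

<p>Streaming matters here because the larger sets hold hundreds of millions of lines; a generator keeps memory use flat regardless of file size.</p>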
<p><br>
<strong>GDB-17</strong><br>
GDB-17-Set (50 million) GDB17.50000000.smi.gz 314 MB<br>
Lead-like Set (100-350 MW & 1-3 clogP)(11 million) GDB17.50000000LL.smi.gz 75 MB<br>
Lead-like Set (100-350 MW & 1-3 clogP) without small rings (3-4 ring atoms)(0.8 million) GDB17.50000000LLnoSR.smi.gz 55 MB</p>
<p><strong>GDB-13</strong><br>
Entire GDB-13 (including all C/N/O/Cl/S molecules) gdb13.tgz 2.6 GB<br>
GDB-13 Subsets (The sum of all the subsets below corresponds to the entire GDB-13 above)<br>
Graph subset (saturated hydrocarbons) gdb13.g.tgz 1.1 MB<br>
Skeleton subset (unsaturated hydrocarbons) gdb13.sk.tgz 14 MB<br>
Only carbon & nitrogen containing molecules gdb13.cn.tgz 443 MB<br>
Only carbon & oxygen containing molecules gdb13.co.tgz 299 MB<br>
Only carbon & nitrogen & oxygen containing molecules gdb13.cno.tgz 1.8 GB<br>
Chlorine & sulphur containing molecules gdb13.cls.tgz 189 MB</p>
<p>GDB-13 Subsets (For details please refer to Table 2 in <em>J Comput Aided Mol Des</em> <strong>2011</strong>, 25:637-647)<br>
GDB-13 Subset AB (~635 million) AB.smi.gz 2.4 GB<br>
GDB-13 Subset ABC (~441 million) ABC.smi.gz 1.7 GB<br>
GDB-13 Subset ABCD (~277 million) ABCD.smi.gz 1.1 GB<br>
GDB-13 Subset ABCDE (~140 million) ABCDE.smi.gz 565 MB<br>
GDB-13 Subset ABCDEF (~43 million) ABCDEF.smi.gz 171 MB<br>
GDB-13 Subset ABCDEFG (~13 million) ABCDEFG.smi.gz 50 MB<br>
GDB-13 Subset ABCDEFGH (~1.4 million) ABCDEFGH.smi.gz 6.2 MB<br>
GDB-13 Random Sample, annotated with frequency and log-likelihood (please refer to <em>Exploring the GDB-13 chemical space using deep generative models</em>)<br>
GDB-13 Random Sample (1 Million) gdb13.1M.freq.ll.smi.gz 14.8 MB</p>
<p><strong>FDB-17</strong><br>
FDB-17 FDB-17-fragmentset.smi.gz 62.2 MB</p>
<p><br>
<strong>GDB4c</strong><br>
GDB4c (SMILES) GDB4c.smi.gz 6.2 MB<br>
GDB4c3D (SMILES) GDB4c3D.smi.gz 161 MB<br>
GDB4c3D (SDF) GDB4c3D.sdf.tar.gz 2 GB</p>
<p><br>
<strong>Other</strong><br>
GDBMedChem (SMILES) GDBMedChem.smi 353.6 MB<br>
GDBChEMBL (SMILES) GDBChEMBL.smi 276 MB<br>
GDB-13 random selection (1 million) gdb13.rand1M.smi.gz 7.2 MB<br>
Fragment-like subset (Rule of three) gdb13.frl.tgz 1.2 GB<br>
Dark matter universe up to 9 heavy atoms dmu9.tgz 87 MB</p>
<p><strong>GDB-11</strong><br>
Entire GDB-11 (including all C/N/O/F molecules) gdb11.tgz 122 MB<br>
Fragrance-like subsets: For details please refer to Ruddigkeit <em>et al.</em>, <em>Journal of Cheminformatics</em> <strong>2014</strong>, 6:27<br>
FragranceDB (SuperScent + Flavornet) FragranceDB.smi 56 KB<br>
TasteDB (SuperSweet + BitterDB) TasteDB.smi 44 KB<br>
FragranceDB.FL (Fragrance-like subset of FragranceDB) FragranceDB.FL.smi 32 KB<br>
ChEMBL.FL (Fragrance-like subset of ChEMBL) ChEMBL.FL.smi 452 KB<br>
PubChem.FL (Fragrance-like subset of PubChem) PubChem.FL.smi 20 MB<br>
ZINC.FL (Fragrance-like subset of ZINC) ZINC.FL.smi 1.3 MB<br>
GDB-13.FL (Fragrance-like subset of GDB-13) GDB-13.FL.smi.gz 165 MB</p>
<p><strong>Terms and conditions</strong>: The GDB databases may be downloaded free of charge. In published research involving GDB, cite the appropriate references mentioned above. GDB must not be used as part of or in patents. GDB and large portions thereof must not be redistributed without the express written permission of Jean-Louis Reymond.</p>
Zenodo
2021-08-09
info:eu-repo/semantics/other
5172017
1662036729.773676
50756
md5:73d518abccea673932a59410baf8b6ae
https://zenodo.org/records/5172018/files/FragranceDB.smi
29798
md5:f38fa3339acc28ba3d520856f60a512f
https://zenodo.org/records/5172018/files/FragranceDB.FL.smi
457184
md5:6a60044a4c744ff699cda2d0b05a2283
https://zenodo.org/records/5172018/files/ChEMBL.FL.smi
90378764
md5:0eda0e2983a8957153b52d81dc82e289
https://zenodo.org/records/5172018/files/dmu9.tgz
65192444
md5:258b948a683fd35196f559aa5b7fa957
https://zenodo.org/records/5172018/files/FDB-17-fragmentset.smi.gz
41297
md5:6ab60aaca88e20b4aeee17b6c87a9f36
https://zenodo.org/records/5172018/files/TasteDB.smi
353552402
md5:afcb6f6092b844413525f4a8abdff17d
https://zenodo.org/records/5172018/files/GDBMedChem.smi
20930930
md5:91f6658d545495e3aa5ee85e26d50c5e
https://zenodo.org/records/5172018/files/PubChem.FL.smi
6354616
md5:5951395313159763236cf07e20549389
https://zenodo.org/records/5172018/files/GDB4c.smi.gz
289811401
md5:9534659e97ac71835ea440794a360c8e
https://zenodo.org/records/5172018/files/GDBChEMBL.smi
2153977
md5:dd7f296ca02c91ea4a61970b082d81dc
https://zenodo.org/records/5172018/files/ZINC.FL.smi
165235800
md5:87ffcee18e7221dd80c01ba613a7f586
https://zenodo.org/records/5172018/files/GDB4c3D.smi.gz
2188638397
md5:c5bfa22f813a4a7d2a7dc86fed2679ce
https://zenodo.org/records/5172018/files/GDB4c3D.sdf.tar.gz
328828730
md5:0e307987b8c970184c34ce51a8beb1ac
https://zenodo.org/records/5172018/files/GDB17.50000000.smi.gz
52149975
md5:8b6d19bdbf68c6a6c4a072e0ff1711d9
https://zenodo.org/records/5172018/files/GDB13_Subset-ABCDEFG.smi.gz
14464008
md5:dd96c04e422e76174b13133850f4c3e4
https://zenodo.org/records/5172018/files/gdb13.sk.tgz
6459037
md5:13c2cda3d40d8808ced5026aeb3a3a01
https://zenodo.org/records/5172018/files/GDB13_Subset-ABCDEFGH.smi.gz
178830700
md5:74a887dae58659b6d1f2216aa9f702bc
https://zenodo.org/records/5172018/files/GDB13_Subset-ABCDEF.smi.gz
56708564
md5:f8147fe50ec98ae8f24d45bad0c31b7e
https://zenodo.org/records/5172018/files/GDB17.50000000LLnoSR.smi.gz
312956230
md5:caf8e9b4e04f2e87b8745934a6064dfd
https://zenodo.org/records/5172018/files/gdb13.co.tgz
171952947
md5:c78ba315f6eccae6b06c9af4f271bf5e
https://zenodo.org/records/5172018/files/GDB-13.FL.smi.gz
15569095
md5:766f3c9ce8b1499bb223ad88ee785586
https://zenodo.org/records/5172018/files/gdb13.1M.freq.ll.smi.gz
1148311
md5:32e62e549a735da5fca3ec26d3ec768f
https://zenodo.org/records/5172018/files/gdb13.g.tgz
7527624
md5:9535d66da275194ecd7edde77e39db33
https://zenodo.org/records/5172018/files/gdb13.rand1M.smi.gz
122040261
md5:23be109329ca081c3fce5cff02a0a5c9
https://zenodo.org/records/5172018/files/gdb11.tgz
1286277759
md5:1b75ed723364b1a5ab767640a48adee7
https://zenodo.org/records/5172018/files/gdb13.frl.tgz
198118375
md5:3a84ee6ff6fb6b0ac829fb187faf5986
https://zenodo.org/records/5172018/files/gdb13.cls.tgz
1845496835
md5:759cc2345f94f94aafd8157a78ad863c
https://zenodo.org/records/5172018/files/gdb13.cno.tgz
464201272
md5:49e806acd6eace5a60351af4b30aa14b
https://zenodo.org/records/5172018/files/gdb13.cn.tgz
2836379943
md5:756b3a359bb653dec1aad6cb0b8150aa
https://zenodo.org/records/5172018/files/gdb13.tgz
2571846186
md5:9b6031322f6d4e709be48df18cc2daf1
https://zenodo.org/records/5172018/files/GDB13_Subset-AB.smi.gz
1779215915
md5:d39e9770901f6dcd91b7bfe1cfed92eb
https://zenodo.org/records/5172018/files/GDB13_Subset-ABC.smi.gz
77829822
md5:6e2222dd95391d2e1bddf9f2414fe75e
https://zenodo.org/records/5172018/files/GDB17.50000000LL.smi.gz
1127352040
md5:e83d7e450b71bb4d44bd24c4b9b31eb7
https://zenodo.org/records/5172018/files/GDB13_Subset-ABCD.smi.gz
591029632
md5:930e81695e4d7bbedc7743d1bbaf03bf
https://zenodo.org/records/5172018/files/GDB13_Subset-ABCDE.smi.gz
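Each file entry above gives a size in bytes, an md5 checksum and a download URL, so a finished download can be checked against the record. The following is a minimal sketch under stated assumptions: the helper names md5_of_file and matches_record are hypothetical, and the checksum is assumed to be passed exactly as listed, with or without the "md5:" prefix.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex md5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path: str, listed: str) -> bool:
    """Compare a local file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


if __name__ == "__main__":
    # File name and checksum copied from the gdb11.tgz entry above.
    ok = matches_record("gdb11.tgz", "md5:23be109329ca081c3fce5cff02a0a5c9")
    print("checksum ok" if ok else "checksum mismatch, download again")
```

Reading in chunks keeps memory use constant, which is relevant for the multi-gigabyte archives. The check detects truncated or corrupted downloads; md5 is not suitable as protection against deliberate tampering.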
public
10.1021/ci600423u
References
doi
10.1002/anie.200462457
References
doi
10.1021/ja902302h
References
doi
10.1021/ci300415d
References
doi
10.5281/zenodo.5172017
isVersionOf
doi