Published April 29, 2021 | Version v1
Dataset Open

DECIMER V1.0 Datasets

  • 1. Friedrich-Schiller-Universität Jena

Description

The dataset contains SMILES strings obtained from PubChem and filtered using the DECIMER filtering rules. It is split into train and test sets. The SELFIES string corresponding to each SMILES string is also included.

1. Dataset 1: Canonical SMILES + SELFIES

2. Dataset 2: Isomeric SMILES + SELFIES

 Please refer to the paper for more details: 

Files

Files (1.6 GB)

Checksum                                Size
md5:be303bf19607ed53b88f709f6e357b45    803.7 MB
md5:c13267117aa184dd00eeb0ca6472aaf2    788.5 MB
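
The MD5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using only the Python standard library. The helper names `md5_of_file` and `is_listed_download` are illustrative and not part of this record, and since the record does not show the archive file names, any path passed in is the reader's own.

```python
import hashlib

# MD5 digests copied from the Files section of this record.
EXPECTED_MD5 = {
    "be303bf19607ed53b88f709f6e357b45",
    "c13267117aa184dd00eeb0ca6472aaf2",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read until an empty bytes object signals end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_listed_download(path):
    """Return True if the file's MD5 matches one of the checksums listed above."""
    return md5_of_file(path) in EXPECTED_MD5
```

Calling `is_listed_download` on each downloaded archive should return `True`. A `False` result suggests an incomplete or altered download, in which case fetching the file again is the simplest remedy.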

Additional details

Related works

Is required by
Preprint: 10.26434/chemrxiv.14479287.v1 (DOI)