Published September 1, 2021 | Version v1
Dataset
Open
Molecule OCR Real images Dataset
Description
Test dataset from the paper "Image2SMILES: Transformer-based Molecular Optical Recognition Engine". The dataset contains 296 structures, each provided as an image together with its Functional Groups SMILES (FG-SMILES) string. The structures were extracted from 24 papers, which were selected from each volume of the Journal of Organic Chemistry (2020).
Files
Name | Size |
---|---|
Molecule_OCR_real_images.zip (md5:450cd4fab2e3b3fd6d897ad99985dd4a) | 4.9 MB |
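Since the record publishes an MD5 checksum for the archive, a downloaded copy can be checked before use. The following is a minimal sketch using only the Python standard library. The function names (`md5_of_file`, `verify_archive`, `list_members`) are hypothetical helpers introduced for illustration, and the internal layout of the archive is not documented on this page, so the sketch only lists member names rather than assuming any folder or file structure.

```python
import hashlib
import zipfile

# MD5 digest as published in the file listing of this record.
EXPECTED_MD5 = "450cd4fab2e3b3fd6d897ad99985dd4a"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected_md5=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected one."""
    return md5_of_file(path) == expected_md5.lower()


def list_members(path):
    """Return the names of all archive members without extracting them."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()
```

Assuming the archive was saved locally as `Molecule_OCR_real_images.zip`, calling `verify_archive("Molecule_OCR_real_images.zip")` should return `True` for an intact download, after which `list_members` can be used to inspect how the images and FG-SMILES annotations are organized.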