Published March 10, 2022 | Version v1
Model | Open Access

Unified Deep Learning Model for Multitask Reaction Predictions with Explanation

Description

The pre-trained model weights and datasets used in T5Chem.

Pre-trained Models

All pre-trained models were trained on 97 million PubChem molecules with a BERT-like self-supervised mask-filling scheme. They need to be loaded as "--pretrain" models in T5Chem and fine-tuned for any downstream task. A ready-to-use model fine-tuned for multi-task training (on USPTO_500_MT) is also available.

  • simple_pretrain.tar.bz2: Character-level PubChem pre-trained model
  • USPTO_MT_model.tar.bz2: Multi-task model fine-tuned on USPTO_500_MT

Note: For simple_pretrain.tar.bz2, fine-tuning is REQUIRED before use (see the loading sketch below).
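As a rough sketch of how the archives might be used: T5Chem builds on Hugging Face Transformers, so after unpacking, the checkpoint directory should be loadable as a standard T5 model for inspection. The extracted directory name (models/simple_pretrain/) and the exact checkpoint layout are assumptions here, not guaranteed by this record; for actual fine-tuning, pass the directory to T5Chem via "--pretrain".

    import tarfile

    from transformers import T5ForConditionalGeneration

    # Unpack the pre-trained archive; the extracted directory name below is an
    # assumption, not guaranteed by this record.
    with tarfile.open("simple_pretrain.tar.bz2", "r:bz2") as archive:
        archive.extractall("models/")

    # Load the checkpoint as a plain T5 model to inspect its configuration.
    # For fine-tuning on a downstream task, point T5Chem's --pretrain flag at
    # the same directory instead.
    model = T5ForConditionalGeneration.from_pretrained("models/simple_pretrain/")
    print(model.config)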


Datasets

Name          Reactions (file size)   Task type
sample        ~60,000 (1.9 MB)        Multi-task (small sample dataset)
USPTO_TPL     445,115 (9.1 MB)        Classification (Reaction Type)
USPTO_MIT     479,035 (18.4 MB)       Forward Prediction
USPTO_50k     50,037 (892 KB)         Retrosynthesis
C-N coupling  3,955 (774 KB)          Regression (Reaction Yield)
USPTO_500_MT  143,535 (54.5 MB)       Multi-task

Files

Total size: 201.6 MB

MD5 checksum                      Size
41abdeb4e6438146f8dbeed77229b631  792.2 kB
5448ea681c715cbc83b7651cbf059944  2.0 MB
0c792d4cb198e433291d9b8a22c3ab21  55.9 MB
a46765a8b16965961c4d8f06eb0a92fd  57.3 MB
44a5f3ae08fe55933404c9398be22f5b  913.1 kB
7f263ce46fc2272271f35d20c87980b1  19.3 MB
124c50a1fb7a7d55582dcfc11b3223bd  55.9 MB
8c5b0df7355b3a68b1a0bb8ccbbde897  9.5 MB

Additional details

Software

Repository URL: https://github.com/HelloJocelynLu/t5chem
Programming language: Python
Development Status: Active