Published April 11, 2025 | Version v1 | Dataset | Open
Processed training dataset for finetuning pocket-based molecular generation task in Token-Mol 1.0
Description
Processed CrossDocked2020 dataset for training the pocket-based molecular generation task in Token-Mol 1.0.
`protein_represent.pkl` contains the embeddings of the protein pockets, encoded with the ResGen encoder.
`mol_input.pkl` contains the ligands corresponding to the pockets in the training set, represented as SMILES strings that have been tokenized.
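Both files are Python pickles, so a first look at them could go roughly as follows. This is a minimal sketch, not the Token-Mol loading code: the helper names `load_pickle` and `summarize` are hypothetical, and the internal layout of each file (list, dict, array, or tensor) is not documented here, so the sketch only reports the type and length of whatever it finds. Unpickling may also require the library that produced the objects (for example NumPy or PyTorch) to be installed, which is an assumption to verify.

```python
import pickle
from pathlib import Path


def load_pickle(path):
    """Load one pickle file. Only unpickle files from a source you trust,
    because pickle.load can execute arbitrary code."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def summarize(obj):
    """Describe a loaded object without assuming its internal layout."""
    length = len(obj) if hasattr(obj, "__len__") else None
    return {"type": type(obj).__name__, "length": length}


if __name__ == "__main__":
    # File names come from the description above; they are read from the
    # current directory only if they have already been downloaded.
    for name in ("protein_represent.pkl", "mol_input.pkl"):
        if Path(name).exists():
            print(name, summarize(load_pickle(name)))
```

Inspecting the type first is a deliberate choice: it avoids indexing into a structure whose shape has not been confirmed.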
Files
(4.6 GB)
| MD5 checksum | Size |
|---|---|
| md5:382ac7e37a84acbe332df6319e13f82f | 11.1 MB |
| md5:d6b84383566307b3903003844106c65c | 4.5 GB |
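The md5 values listed for the files can be used to confirm that a download completed intact. The sketch below is one way to do that with the standard library; the function names `md5_of_file` and `matches_checksum` are hypothetical, and the file is read in chunks so that the 4.5 GB file does not have to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or '<hex>'."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

For example, `matches_checksum("mol_input.pkl", "md5:...")` would be called with the checksum listed for that file on this page; which checksum belongs to which file should be taken from the record itself.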