There is a newer version of the record available.

Published June 5, 2024 | Version v1
Dataset Open

Reactzyme: A Benchmark for Enzyme-Reaction Prediction

Description

Official dataset for Reactzyme: A Benchmark for Enzyme-Reaction Prediction.

Our study uses a comprehensive dataset compiled from the SwissProt and Rhea databases. SwissProt, the curated subset of the UniProt database, was selected for its high-quality, human-derived functional annotations of protein sequences: its entries are expert-reviewed, which ensures reliable and accurate functional data and makes it well suited to our analysis. Rhea is used for its precise mapping from enzymes to the specific reactions they catalyze, with detailed descriptions of each biochemical reaction.

The SwissProt and Rhea datasets were downloaded on January 8, 2024, and include all entries available up to that date, providing the most recent and comprehensive data available for our study. We selectively exclude water molecules and unspecific functional groups that could mask the true molecular structures. Conversely, we retain metal ions, gas molecules, and other small molecules because of their potential to bind to proteins, a characteristic that presents a valuable learning feature for our model. In total, the dataset comprises 178,463 positive enzyme-reaction pairs, including 178,327 unique enzymes and 7,726 unique reactions.
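
The water-exclusion step described above can be illustrated with a minimal sketch. It assumes that reactions are written as two-part reaction SMILES ("reactants>>products") with components separated by "."; this record does not state the actual representation, and the function name `drop_water` and the set `WATER_SMILES` are hypothetical, introduced only for illustration. Every component that is not recognized as water is passed through unchanged.

```python
# Hypothetical set of SMILES spellings treated as water. The spellings used
# in the released files are not documented here, so this is an assumption.
WATER_SMILES = {"O", "[OH2]", "[H]O[H]"}


def drop_water(reaction_smiles):
    """Remove water components from a 'reactants>>products' reaction SMILES.

    Assumes the two-part form with a single '>>' separator; any other form
    raises ValueError when the split result is unpacked.
    """
    reactants, products = reaction_smiles.split(">>")

    def keep(side):
        # Split one side into molecules and drop those spelled as water.
        return ".".join(m for m in side.split(".") if m and m not in WATER_SMILES)

    return keep(reactants) + ">>" + keep(products)


# Ester hydrolysis: methyl acetate + water -> acetic acid + methanol
print(drop_water("CC(=O)OC.O>>CC(=O)O.CO"))  # CC(=O)OC>>CC(=O)O.CO
```

A string match of this kind only catches the exact spellings listed; a cheminformatics toolkit that canonicalizes each component first would be more robust.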

Files

deepchem_vocab.txt

Files (395.8 MB)

Checksum                                Size
md5:669bdd627c946114e87f06bffb4f33d9    87.4 MB
md5:95ca5f1d57a4a7a82bb3cca0ad742e9c    3.6 kB
md5:e351fdb85830968fc9abe933c39f9eda    47.5 MB
md5:5a64bef090335f884a767006867d64cf    1.4 kB
md5:2d9f4e6c78d8daf5752cc2a5ae2bef0d    46.7 MB
md5:cb5a575a08954f6d28311b9a4bef52fe    3.5 MB
md5:c437435a239326c157e1d20f00d8e00e    47.6 MB
md5:a669647f418bf54dc7c5d0059c2b2a09    62.3 MB
md5:5b9d384c96a597680b140c3a333f1600    100.7 MB
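
The MD5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_listing` are hypothetical and are not part of the dataset or of any Reactzyme tooling.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

To check a file, pass its local path together with the checksum shown for it in the listing, for example `matches_listing(local_path, "md5:...")`. Reading in chunks keeps memory use constant even for the larger files in this record.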