There is a newer version of the record available.

Published April 18, 2023 | Version 1
Dataset Open

EnzymeMap

  • 1. TU Wien
  • 2. IBM
  • 3. Massachusetts Institute of Technology

Description

EnzymeMap (enzymemap_brenda2023.csv) is a large dataset of atom-mapped, balanced enzymatic reactions sorted by EC (Enzyme Commission) number. It is intended for training machine learning models that predict enzymatic reactions or perform bioretrosynthesis. For details on the extraction, correction, and curation of the data, please refer to the publication "EnzymeMap: Curation, validation and data-driven prediction of enzymatic reactions" by E. Heid, D. Probst, W. H. Green and G. K. H. Madsen. Please cite this publication if you use EnzymeMap. A preprint is available at https://doi.org/10.26434/chemrxiv-2023-jzw9w.

The file raw_unmapped_brenda2022.csv additionally holds the raw, unmapped, uncurated data used in the publication to retrain the IBM RXN-for-Chemistry platform.

EnzymeMap originates from curated data taken from BRENDA version 2023-1, which was then atom-mapped and modestly extended. For some reactions or enzyme classes, BRENDA includes additional (uncurated) information that is not part of EnzymeMap. Readers looking for more information on a particular reaction or enzyme class should check the corresponding BRENDA entry and the original literature sources.
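
Since the record does not document the column layout of enzymemap_brenda2023.csv, a reasonable first step is to inspect the header before building anything on top of it. The following is a minimal sketch using only the Python standard library. The helper names peek_csv and count_by_ec_class are hypothetical, and the ec_column argument is an assumption: the actual name of the column holding the EC number has to be read from the file's header.

```python
import csv
from collections import Counter


def peek_csv(path, n_rows=5):
    """Return the header and the first n_rows data rows of a CSV file."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        # zip stops as soon as range() is exhausted, so at most n_rows are read.
        rows = [row for _, row in zip(range(n_rows), reader)]
    return header, rows


def count_by_ec_class(path, ec_column):
    """Count rows per top-level EC class (the first field of e.g. '1.1.1.1').

    ec_column is the name of the column holding the EC number. It is an
    assumption here and must be taken from the actual file header.
    """
    counts = Counter()
    with open(path, newline="", encoding="utf-8") as handle:
        for record in csv.DictReader(handle):
            ec_number = (record.get(ec_column) or "").strip()
            if ec_number:
                counts[ec_number.split(".")[0]] += 1
    return counts
```

A typical use would be to call peek_csv("enzymemap_brenda2023.csv") first, pick the EC column from the returned header, and then pass that name to count_by_ec_class to see how the reactions are distributed over the top-level enzyme classes.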

Files

Files (161.8 MB)

Checksum  Size
md5:9ff47ed49a54a861c4d424821fee9de7  97.7 MB
md5:3f7b077cd171b125a3fea44681f9510e  64.1 MB
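
The MD5 checksums listed above can be used to confirm that a download is complete and uncorrupted. The sketch below computes a file's MD5 digest in chunks with Python's hashlib and checks it against the two listed values. The listing does not state which checksum belongs to which file, so the check accepts either one. The names LISTED_MD5, md5_of_file, and matches_listed_checksum are hypothetical.

```python
import hashlib

# MD5 digests as listed in the file table. The listing does not say which
# digest belongs to which file, so both are accepted here.
LISTED_MD5 = {
    "9ff47ed49a54a861c4d424821fee9de7",
    "3f7b077cd171b125a3fea44681f9510e",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path):
    """True if the file's MD5 digest equals one of the listed checksums."""
    return md5_of_file(path) in LISTED_MD5
```

For example, matches_listed_checksum("enzymemap_brenda2023.csv") returning False would suggest a truncated or altered download that should be fetched again.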

Additional details

Funding

FWF Austrian Science Fund
Computer-aided design of multi-enzyme networks J 4415