Enzymes from the BRENDA and CAZy databases annotated with organism growth temperatures and predicted Topt
Description
This repository is an updated version of the following data set: Gang Li & Martin KM Engqvist. (2019). Enzymes from the BRENDA database annotated with organism growth temperatures and predicted Topt (Version 1.0) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.2539114.
Experimental as well as predicted organism growth temperatures were used to annotate enzymes from the BRENDA database (doi: 10.1093/nar/gky1048, https://www.brenda-enzymes.org), version 2018.2 (July 2018), and from the CAZy database (http://www.cazy.org/).
An updated machine learning model was applied to predict the optimal functional temperature of enzymes from BRENDA and CAZy.
There are four files in this repo:
1. 'annotated_brenda.tsv' is a tab-separated file that contains the annotated enzymes from BRENDA. There are 9 columns in the file: index column; "ec", EC number; "uniprot_id", protein ID in the UniProt database; "domain", the domain of life (superkingdom), either Archaea, Bacteria, or Eukarya; "organism", species name; "ogt", optimal growth temperature of the organism; "ogt_note", whether the experimental or predicted ogt is used; "topt", the optimal functional temperature of the enzyme; "topt_note", whether the experimental or predicted topt is used.
2. 'annotated_cazy.tsv' is a tab-separated file that contains the annotated enzymes from CAZy. There are 12 columns in the file: index column; "family", CAZy family ID; "genbank", GenBank ID; "Protein Name", the protein name from the CAZy database; "ec", EC number; "organism", strain name; "uniprot_id", protein ID in the UniProt database; "PDB/3D", structure ID in the PDB database; "ogt", optimal growth temperature of the organism; "ogt_note", whether the experimental or predicted ogt is used; "topt", the optimal functional temperature of the enzyme; "topt_note", whether the experimental or predicted topt is used.
3. 'brenda.sql' is a SQLite3 database version of 'annotated_brenda.tsv', with an additional column of enzyme sequences.
4. 'cazy.sql' is a SQLite3 database version of 'annotated_cazy.tsv', with an additional column of enzyme sequences.
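The column layout described for the TSV files can be read with Python's standard `csv` module. The following is a minimal sketch, not the authors' own tooling: the helper names `read_annotated_tsv` and `filter_by_topt` are hypothetical, the sample rows are invented purely for illustration, and the literal values used in the "ogt_note" and "topt_note" columns ("experimental", "predicted") are an assumption about how the files encode that information.

```python
import csv
import io


def read_annotated_tsv(handle):
    """Yield one dict per data row of a tab-separated file with a header line."""
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        yield row


def filter_by_topt(rows, min_topt):
    """Keep rows whose "topt" value parses as a number >= min_topt."""
    kept = []
    for row in rows:
        try:
            topt = float(row["topt"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with a missing or non-numeric topt
        if topt >= min_topt:
            kept.append(row)
    return kept


# Invented sample with the 9 columns listed for 'annotated_brenda.tsv'.
# The IDs, organisms, temperatures, and note values are NOT real data.
SAMPLE = (
    "\tec\tuniprot_id\tdomain\torganism\togt\togt_note\ttopt\ttopt_note\n"
    "0\t1.1.1.1\tX00001\tBacteria\tExample organism A\t30\texperimental\t37.0\tpredicted\n"
    "1\t1.1.1.1\tX00002\tArchaea\tExample organism B\t85\tpredicted\t80.0\texperimental\n"
)

rows = list(read_annotated_tsv(io.StringIO(SAMPLE)))
hot = filter_by_topt(rows, 60.0)
for row in hot:
    print(row["uniprot_id"], row["organism"], row["topt"], row["topt_note"])
```

For the real file, the same functions could be fed `open("annotated_brenda.tsv", newline="")` in place of the in-memory sample. Streaming rows this way avoids loading the whole file at once, which matters for files of several hundred megabytes.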
The SQLite3 databases are for the Tome tool (https://github.com/EngqvistLab/Tome), version 2.0.
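The description does not state the table or column names inside the SQLite3 files, so a reasonable first step outside of Tome is to inspect the schema. The sketch below uses only Python's standard `sqlite3` module; the function name `describe_sqlite` is hypothetical, and nothing here reflects how Tome itself reads the databases.

```python
import sqlite3


def describe_sqlite(path):
    """Return {table_name: [column_name, ...]} for a SQLite database file."""
    conn = sqlite3.connect(path)
    try:
        tables = [
            r[0]
            for r in conn.execute(
                "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
            )
        ]
        schema = {}
        for table in tables:
            # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
            cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
            schema[table] = [c[1] for c in cols]
        return schema
    finally:
        conn.close()
```

Calling `describe_sqlite("brenda.sql")` on a downloaded copy would list the tables and their columns, after which ordinary `SELECT` queries can be written against the names it reports. Note that `sqlite3.connect` creates an empty database if the path does not exist, so the path should be checked first.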
Files
(4.8 GB)
MD5 checksum | Size |
---|---|
md5:71db7d042e9e8dddbd3a42a39a6a7234 | 623.9 MB |
md5:5c0675fb86bbfcbcefb0619c678aa884 | 107.1 MB |
md5:356812b7362d00e4f9f264f65fbd6376 | 3.5 GB |
md5:715542dd53c733a76b241d8313aaa57f | 601.5 MB |
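The md5 values in the file listing can be used to check that a download completed intact. The sketch below is one way to do this with Python's standard `hashlib`; the function names `md5_of_file` and `matches_checksum` are hypothetical. The listing as shown does not name the file belonging to each checksum, so the pairing of file and checksum has to be taken from the download page.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 with an expected value, with or without an 'md5:' prefix."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use constant, which is relevant for the multi-gigabyte files in this record. A call would look like `matches_checksum(path_to_download, "md5:...")` with the checksum copied from the listing.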