Published June 28, 2024 | Version v1
Dataset
Open
Enhanced Sequence-Activity Mapping and Evolution of Artificial Metalloenzymes by Active Learning
Description
This entry contains data, pretrained models and supplementary files for our enzyme engineering study:
Title: Enhanced Sequence-Activity Mapping and Evolution of Artificial Metalloenzymes by Active Learning
Journal: ACS Central Science
If you use any of the data or code in the repository, please cite the paper.
The contents of this entry are:
- Sequence embeddings needed to run the code are in data.zip. Unzip its contents into /data in the code structure.
- NGS sequencing analysis and raw data are in NGS analysis.zip.
- A 10% subset of the structures generated with the Rosetta software is in structures.zip.
- Pretrained and saved models for plotting, clustering and further prediction are in models.zip.
- Raw assay data, including the libraries we designed by active learning, are in assay_and_ML_data.zip.
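The unpacking step for data.zip can be sketched as follows. This is a minimal illustration, not part of the published code: the helper name `extract_archive` is hypothetical, and the example paths assume that the archive was downloaded into the current directory and that the repository was cloned next to it.

```python
import zipfile
from pathlib import Path


def extract_archive(archive_path, target_dir):
    """Extract a zip archive into target_dir and return the member names.

    Hypothetical helper for illustration; the repository may expect a
    different layout, so check its README before relying on this.
    """
    target = Path(target_dir)
    # Create the target folder (and parents) if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(target)
        return zf.namelist()


# Example call (assumed paths, adjust to your local checkout):
# extract_archive("data.zip", "ml-protein-design-sav-gold/data")
```

The other archives can be unpacked the same way into a location of your choice; only data.zip has a target folder named in this entry.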
Files
(7.1 GB)
Checksum | Size |
---|---|
md5:aef5cba9b0abf3f5fc5353bd5c9ca565 | 365.5 kB |
md5:d1d13f569f1e1e652b7b3abc10adf60c | 1.7 GB |
md5:0ce73b5b2a2e0fac27224ece2679850d | 598.8 kB |
md5:9fe5c705f194144fdcf6fcbe88816bef | 352.6 MB |
md5:396ebd69dc5f8e352a0d00354a178a20 | 5.0 GB |
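The md5 checksums listed for the files can be used to check that a download completed without corruption. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and because this listing does not show which checksum belongs to which archive, the check only tests whether a file's digest appears among the published values.

```python
import hashlib

# md5 digests copied from the file listing of this entry.
PUBLISHED_MD5 = {
    "aef5cba9b0abf3f5fc5353bd5c9ca565",
    "d1d13f569f1e1e652b7b3abc10adf60c",
    "0ce73b5b2a2e0fac27224ece2679850d",
    "9fe5c705f194144fdcf6fcbe88816bef",
    "396ebd69dc5f8e352a0d00354a178a20",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path):
    """Return True if the file's md5 digest is one of the published values."""
    return md5_of_file(path) in PUBLISHED_MD5


# Example call (assumed local file name):
# matches_published_checksum("models.zip")
```

Reading in chunks matters here because the largest archive is 5.0 GB and should not be loaded into memory at once.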
Additional details
Related works
- Is supplement to
- Publication: 10.1021/acscentsci.4c00258 (DOI)
Funding
- Swiss National Science Foundation: NCCR Catalysis (phase I) 180544
- Swiss National Science Foundation: NCCR Molecular Systems Engineering 200021_178760
Software
- Repository URL: https://github.com/lasgroup/ml-protein-design-sav-gold
- Programming language: Python
- Development Status: Concept