Published December 6, 2024 | Version v1
Dataset Open

Molecular dynamics and AlphaFold dataset of fluorinated protein variants

Description

This dataset contains AlphaFold predictions and molecular dynamics (MD) trajectories of 50 fluorinated protein variants. Experimentally measured fluorine NMR chemical shifts are provided in the CSV file. Unless otherwise stated, secondary shifts were calculated using a random coil / reference chemical shift of -61.82 ppm. Proteins were labelled with 4-trifluoromethyl-L-phenylalanine (tfmF), which was modelled as a tyrosine residue in the AlphaFold predictions and explicitly in the MD simulations with the CHARMM36m (C36m) and ff15ipq force fields. For each variant, ColabFold and AlphaFold 3 predictions (five models) are included. For MD, each variant has three independent trajectories of one microsecond each (150 microseconds in total across the dataset per force field). Each simulation folder contains distance (in Å) and angle (in degrees) descriptors, which quantify the interaction of tfmF with nearby aromatic residues in order to predict ring current effects. Each folder also contains the solvent-accessible surface area of the trifluoromethyl group calculated over each trajectory, and the RMSD with respect to the starting structure. General template folders for each force field (and an additional template for HRAS, which requires custom GDP/ligand parameters) are also provided so that the setup and production simulations for fluorinated protein variants can be reproduced.
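As a minimal sketch of the secondary-shift convention mentioned above, the snippet below subtracts the reference shift of -61.82 ppm from an observed fluorine shift. The function name and the example value are hypothetical and are not taken from the CSV file; the sign convention (observed minus reference) is the usual definition and is assumed here rather than confirmed by the dataset.

```python
# Reference (random coil) 19F shift stated in the description, in ppm.
REFERENCE_SHIFT_PPM = -61.82


def secondary_shift(observed_ppm, reference_ppm=REFERENCE_SHIFT_PPM):
    """Return the secondary shift in ppm, assumed to be observed minus reference."""
    return observed_ppm - reference_ppm


# Hypothetical observed shift, for illustration only (not a value from the dataset).
example = secondary_shift(-62.50)
print(f"{example:+.2f} ppm")  # prints "-0.68 ppm"
```

Entries in the CSV file that state a different reference shift would need that value passed as `reference_ppm`.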

Excel files summarising the average distance and angle descriptors for each variant, together with the experimental chemical shifts, are also provided. Descriptors are averaged over all AlphaFold models or across all three MD trajectories, with the first 200 ns discarded as equilibration.
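The averaging described above can be sketched as follows. This is an illustration only: the in-memory layout (a list of time/value pairs per trajectory), the function names, and the choice to average the per-trajectory means rather than pool all frames are assumptions, not the procedure confirmed for the Excel summaries. For trajectories of equal length and sampling, the two choices give the same result.

```python
from statistics import mean

# First 200 ns discarded as equilibration, as stated in the description.
EQUILIBRATION_NS = 200.0


def trajectory_average(times_ns, values, skip_ns=EQUILIBRATION_NS):
    """Mean of a descriptor over frames recorded at or after skip_ns."""
    kept = [v for t, v in zip(times_ns, values) if t >= skip_ns]
    if not kept:
        raise ValueError("no frames remain after discarding equilibration")
    return mean(kept)


def variant_average(trajectories, skip_ns=EQUILIBRATION_NS):
    """Average the per-trajectory means over all independent trajectories."""
    return mean(trajectory_average(t, v, skip_ns) for t, v in trajectories)


# Toy data (hypothetical distances in Å at 100 ns intervals, three trajectories).
toy = [
    ([0, 100, 200, 300], [9.0, 9.0, 4.0, 6.0]),
    ([0, 100, 200, 300], [9.0, 9.0, 5.0, 5.0]),
    ([0, 100, 200, 300], [9.0, 9.0, 6.0, 6.0]),
]
print(round(variant_average(toy), 3))
```

Whether the frame at exactly 200 ns is kept or dropped is also an assumption here; the comparison `t >= skip_ns` keeps it.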

The dataset also includes six long MD trajectories (lasting 20 microseconds) of two HemK variants (N-terminal domain, residues 1-73) simulated with the DES-Amber protein force field, as a proof of concept that fluorinated probes (38tfmF for HemK) report on mutation-induced conformational changes. The wild type (K12 strain) and a triple mutant (I26V/R34K/Q46R) were simulated.

Notes

This study was funded by a Wellcome Trust Investigator Award (to J.C., 206409/Z/17/Z). Computational resources were provided by the Baskerville Tier 2 HPC service (https://www.baskerville.ac.uk/). Baskerville was funded by the EPSRC and UKRI through the World Class Labs scheme (EP/T022221/1) and the Digital Research Infrastructure programme (EP/W032244/1) and is operated by Advanced Research Computing at the University of Birmingham. We are also grateful for computational resources from the UK Materials and Molecular Modelling Hub, which is partially funded by the EPSRC (EP/T022213/1, EP/W032260/1 and EP/P020194/1), and from the UCL Kathleen High Performance Computing Facility (Kathleen@UCL) and its associated support services.

Files

AlphaFold.zip

Files (17.9 GB)

Checksum Size
md5:ae2c891ddcb4cdc934666bf1de89b78f 108.4 MB
md5:2cd10e1caa50a37226385e3d12ad99d9 1.1 MB
md5:5ee1cc920edbb60534046882412da053 2.3 kB
md5:69a96ab5b80a5052be82512e951fa8ca 40.1 kB
md5:38358362f36ca6cbfc7f59076d55f18f 34.8 kB
md5:9f5627ae0a5255668465320361806941 1.1 GB
md5:65c64e0631c97c12a36f463f3faa9742 7.2 GB
md5:ab6fe7f0a22b7c800a968ba857d470b6 9.5 GB
md5:d85e83b9c6e119af3e38eceb21515c33 13.9 kB
md5:27847dfc628d411afbf627145341c1b1 16.1 kB
md5:4966f9e93a15831dd14be48cf9787bac 13.3 kB
md5:683e61119bdf6c27c7743632089d78f4 13.9 kB
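
A downloaded file can be checked against its listed MD5 checksum with a short script such as the sketch below. It uses only the Python standard library; the function names are hypothetical, and the local path passed in is whatever name the file was saved under, since this listing does not pair checksums with file names.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a checksum given as 'md5:<hex>' or bare hex."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use low for the multi-gigabyte archives. A call would look like `matches_checksum("path/to/download", "md5:ab6fe7f0a22b7c800a968ba857d470b6")`, where the path is a placeholder.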

Additional details

Funding

Wellcome Trust
Integrative Structural Biology of Protein Folding on the Ribosome 206409/Z/17/Z