The Role of Water in the Stability of Wild Type and Mutant Insulin Dimers

Insulin dimerization and aggregation play important roles in the endogenous deliv-ery of the hormone. One of the important residues at the insulin dimer interface is Phe B24 which is an invariant aromatic anchor that packs towards its own monomer in-side a hydrophobic cavity formed by Val B12 , Leu B15 , Tyr B16 , Cys B19 and Tyr B26 . Using molecular dynamics and free energy simulations in explicit solvent, the structural and dynamical consequences of mutations of Phe at position B24 to Gly, Ala, and d-Ala and the des-PheB25 variant are quantiﬁed. Consistent with experiments it is found that the Gly and Ala modiﬁcations lead to insulin dimers with reduced stability by 4 and 5 kcal/mol from thermodynamic integration and 4 and 8 kcal/mol from results using MM-GBSA, respectively. Given the experimental diﬃculties to quantify the thermodynamic stability of modiﬁed insulin dimers, such computations provide a valuable complement. Interestingly, the Gly-mutant exists as a strongly and a weakly interacting dimer. Analysis of the molecular dynamics simulations shows that this can be explained by water molecules that replace direct monomer-monomer H-bonding contacts at the dimerization interface involving residues B24 to B26. It is concluded that such solvent molecules play an essential role and must be included in future insulin dimerization studies.


Introduction
Insulin is a small, aggregating protein that plays an eminent role in regulating glucose uptake in cells. In its crystal form, the WT hormone is a hexamer consisting of three dimers with either two or four Zn atoms bound to it. 1 Each dimer consists of two monomers (chain A with 21 amino acids and chain B with 30 amino acids), connected by two inter-chain (Cys A7 -Cys B7 , Cys A20 -Cys B19 ) and one intra-chain (Cys A6 -Cys A11 ) disulfide bonds (see Figure 1).
Under physiological conditions insulin monomers readily aggregate to form dimers. Experimental and computational studies have found that the main stabilizing contributions to self-association are nonpolar interactions through directionality provided by hydrogen bonds. [1][2][3][4][5] Dissociation of the dimer to form two monomeric insulins is of great physiological importance as the monomer is the functionally relevant state of the hormone. One suggested possibility to suppress aggregation is to modify the dimerization interface through suitable substitutions 3 or chemical modifications. 6 The current view of the insulin structure-function relationship is derived primarily from insulin hexamer and dimer crystal structures, as well as from studies of the structureactivity correlations of chemically modified and/or naturally occurring mutant insulins in solution. [7][8][9][10][11][12][13][14][15] Mutagenesis of the dimer-forming surface of insulin can yield analogues with a reduced tendency to aggregate and pronounced differences in the pharmacokinetic properties with potentially promising therapeutic applications. 7,16 Typical experimental methods for quantitative studies of insulin dimerization are isothermal titration calorimetry (ITC) 3 or NMR spectroscopy. ITC requires relatively high protein concentrations, while NMR spectroscopy can be slow for such purposes. Several NMR studies of active monomeric insulin mutants show a rearrangement of the C-terminal end of chain B. 8,14 The removal of residues B26-B30 in despentapeptide prevents dimerization without any significant changes in the rest of the molecule. 1,3,17 This truncated insulin is at least 50 % as potent as human insulin. 3,15 A combined Raman spectroscopy and microscopy study of insulin in different aggregation states (monomer, dimer, hexamer and fibril) shows that dimerization damps fluctuations at an intermolecular β-sheet. 18 Experimental alanine scanning finds substitution of alanine at various positions to reduce insulin affinity for the receptor by more than 20-fold. 19 While the residues that are most likely to be directly involved in binding are A1, A2, A3, A19, B12, B23 and B24, any substitution of residues A1-A3 has been shown to impair function. 15 Phenylalanine at position B24 is invariant among insulin sequences and is located at the dimerization interface maintaining the orientation of the B-chain of the monomer. 8,[20][21][22] These observations together with studies of low-potency B24 analogues suggest that the Phe B24 amino acid residue plays an important role 23 in the activity of insulin, while Ser B24 (Ref. 24,25 ), Leu B24 (Ref. 26 ), and His B24 (Ref. 27 ) analogues show reduced binding potency.
However, it was also found that certain B24 substitutions, such as Gly, 16,28 D-Ala, 11,29 D-His, 27 Met and Cha 30 are well tolerated in view of the affinity of insulin to its receptor. The bioactivity of such insulins has also been described as "anomalous" because it can not be readily explained by crystal models.
Experimental data for dimerization free energies of insulin analogues is scarce due to several experimental challenges. The role of Phe B24 in stabilizing the insulin dimer has also been studied to some extent. Unlike WT insulin, the Gly B24 mutant does not dimerize in aqueous solution at pH 1.9. 8 Furthermore, alanine scanning of the dimerization interface revealed that the Ala B24 analogue is monomeric and does not readily aggregate. 31,32 This suggests that the Ala B24 and Gly B24 analogue dimers are less stable than the WT dimer.
Furthermore, ITC measurements of N-methylated insulin dimer analogues at positions B24, B25, and B26 33,34 revealed considerably reduced (by a factor of 5) dimerization capabilities compared with human insulin.
Because the dimer↔monomer equilibrium is one of the essential steps in forming the receptor binding-competent monomeric form of insulin and the process is difficult to study quantitatively by experiments, MD simulations are an attractive alternative to characterize the stability of insulin dimers. In the present work the stability and dynamics of the insulin dimer analogues (Gly B24 , Ala B24 , D-Ala B24 and des-Phe B25 ) are investigated using atomistic simulations in explicit solvent. The relative stabilities of the analogues are compared with the native WT dimer and with qualitative results from experiments. Both, protein-protein and protein-water contacts are analyzed to characterize the role of water 35 and hydrogen bonding at the dimer-forming surface. Previous computational studies of the WT dimer 4,36 and of alanine scans 37 have provided important complementary information to experimental characterizations and are a rational basis to extend such an approach to modified insulins using explicit free energy simulations. Hence, the present work lays the foundation to extend insulin dimerization studies to arbitrary modifications at the interface.
This article is organized as follows: Section 2 introduces the methods. In Section 3 the dynamics and stability of insulin dimers and the inter-molecular hydrogen bonding between monomer-monomer and water-protein (water-bridged) is presented and discussed. Finally conclusions are drawn in Section 4.

Molecular Dynamics Simulations
Molecular dynamics (MD) simulations were carried out using CHARMM 38 together with the "all-atom" CHARMM22 39 force field including the CMAP correction 40,41 and periodic boundary conditions (PBC). Additional validation simulations for the insulin dimer were carried out using Gromacs 42 and the CHARMM36 force field 43 as described in the supporting information. The starting coordinates for the MD simulations were the X-ray structure of the WT porcine insulin dimer resolved at 1.5 Å (Protein Data Bank (PDB 44,45 ), Code: 4INS. 1 The structure contains the coordinates of the insulin dimer and two aggregated Zn atoms. Zn atoms are removed as they been shown to be relevant only for hexamer formation.
Hydrogen atoms were added to the X-ray structure. The resulting structure was used to gen-erate mutants computationally. For this, the Phe residue at position B24 was mutated into Glycine (Gly), Alanine (Ala), and D-Alanine (D-Ala) yielding mutants Gly B24 , Ala B24 , and D-Ala B24 , respectively. Furthermore, the wildtype (WT) insulin dimer without the Phe B25 amino acid on both monomers, a des-Phe B25 mutant was also studied.
The wildtype dimer and mutants were solvated in a 77.6 × 62.8 × 55.8 Å box of TIP3P 46 water molecules. All MD simulations were carried out in explicit water. Water molecules overlapping the protein were removed which leads to a system with approximately 1550 protein atoms and 8495 water molecules. The solvent was equilibrated at 300 K for 30 ps with the insulin frozen. Then 2000 steps of steepest descent (SD) minimization were carried out.
The entire system was heated to 300 K during 15 ps using harmonic constraints with a force constant of 5 kcal/mol Å 2 on the position of the backbone atoms. The system was further equilibrated for 120 ps by gradually decreasing harmonic constraints on the backbone atoms.
For all simulations the Verlet leapfrog integrator was used for time propagation with a time step of 1 fs. A 12 Å cutoff was applied to the shifted electrostatic and switched van der Waals interactions and images for periodic boundary conditions were updated every 10 time steps.
All distances to hydrogen atoms were constrained by using SHAKE. 47 For the WT dimer and mutants (Gly B24 , Ala B24 , D-Ala B24 and des-Phe B25 ) multiple individual trajectories were run starting from different structures taken from the equilibration run (Table S1). Simulations in the N P T and N V T ensembles were run using the extended system constant pressure and temperature (CPT) algorithms 48 with a Hoover thermostat. 49 In addition, microcanonical N V E simulations were run as well.

Calculation of Binding Free Energy
In the present work, the stability of the WT and mutant insulin dimers is determined from two complementary approaches. One of them is the molecular mechanics-generalized Born surface area (MM-GBSA) approach 50,51 and the other one is thermodynamic integration. 52,53 The thermodynamic cycles used for computing the binding free energy ∆G bind using MM-GBSA and thermodynamic integration (TI) are shown in Figures 2A and B  To corroborate the results from the MM-GBSA simulations and to investigate the role of water molecules on the thermodynamic stability, dimer stabilization free energies were also determined from thermodynamic integration (TI) in the presence of explicit water molecules.
TI applies a scaling parameter λ to switch between an initial (state A, λ = 0) and the final (state B, λ = 1) state by gradually damping all nonbonded interactions. The λ = 0 and λ = 1 states correspond to the grown and annihilated nonbonded interactions on either the protein dimer or on the monomer, respectively. Initial coordinates were taken from the equilibrated simulations. TI simulations were performed using CHARMM's PERT module using soft-core potentials [54][55][56] for the LJ interactions that applied only on the repulsive part of the LJ potential (it was used when the electrostatic interactions were turned off). For this computational approach, restraining potentials 57 affecting the translational, rotational and conformational freedom of the protein may be activated and released during the simulations to aid convergence and improve the sampling.
Free energy simulations at each λ−value were carried out in the N P T ensemble, using the Hoover heat-bath method 49 with pressure coupling at T = 298 K, p = 1 atm, and with the masses of the temperature and pressure piston set to roughly 20 % and 2 % of the system's mass, respectively. A friction coefficient of 50 ps −1 was used. The interval 0 < λ < 1 was divided into 40 equidistant steps to ensure accuracy. For each λ−value the system was re-equilibrated for 60 ps followed by 100 ps of dynamics during which information was accumulated. λ was changed from initial to final value using the slow-growth protocol, 58 which allowed the system to re-equilibrate between steps. Contrary to the MM-GBSA simulations (see SI), TI run in this fashion includes all enthalpic and entropic contributions. Assuming that the change in the entropic contribution ∆∆S WT−Mutant remains approximately constant for the various mutants it is expected that the stability ranking from the two methods remains the same which is a testable hypothesis.
The protein dimer stability difference ∆∆G stab = ∆G dimer − 2 × ∆G monomer was computed within the "same trajectory method" 4 such as to close the thermodynamic cycle, see Figure   2B (left panel).
The free energy of mutating Phe B24 (WT) into Gly and Ala was also calculated directly using the dual topology approach 54,59 for mutations F24G and F24A in the dimer (Figure 2, Panel B right). The effect of mutating the B24 side chain on the stability of the protein ∆∆G was calculated according to ∆G 2 -∆G 1 , where ∆G 1 and ∆G 2 are the free energies of mutating the side chain of a WT residue into another residue (mutant), in the aqueous phase, as a sole residue and in the protein, respectively. For these simulations, no restraints were used and the interval 0 < λ < 1 was divided into 34 steps. Windows at the two ends of the λ interval were more finely spaced. For each of these steps the system was re-equilibrated for 30 ps followed by 60 ps of dynamics during which information was accumulated. The results are averages and standard deviations of five runs.

Results and Discussion
In the following, first, the stabilities from MM-GBSA and thermodynamics integration simulations of the dimers are discussed. Then, the findings are discussed in the context of the dynamics and interactions along the dimerization interface. Finally, the role of water molecules is considered.

Dimerization Free Energies
One of the main objectives of the present study is to determine the relative stabilities of mutated insulin dimer analogues relative to the WT dimer. Experimentally, a dimerization energy of −7.2 ± 0.8 kcal/mol 60 in favour of the dimer was determined for WT insulin which compares with ∆G bind = −11.9 ± 6.7 kcal/mol (absolute binding free energy) and  Figure S1). These contributions are consistent with results reported in an earlier MM-GBSA study which found −3.92 and −2.68 kcal/mol, respectively. 4 Experimentally, the des-Phe B25 insulin dimer was also found to be unstable as it reported exclusively monomeric insulin for this variant. 2 The individual contributions (see Table 1) to the binding free energy suggest that stabilization is predominantly due to nonpolar terms E vdW and G solv,nb . The favorable E elec contribution from the two monomers is canceled by the desolvation energy G solv,elec upon dimerization.
This is found for all modified insulins investigated here and supports previous investigations of the WT. 4 Including the entropic contribution T (∆S trans + ∆S rot + ∆S vis ) to ∆G bind , allows more direct comparison with experimentally determined values. For the WT the stabilization free energy is ∆G bind = −16.0 ± 6.9 kcal/mol when accounting for entropic contributions, in qualitative agreement with the experimental binding free energy of −7.2 ± 0.8 kcal/mol. 60 As a comparison, previous simulation work based on one trajectory found an enthalpic stabilization a The standard deviation of −T ∆S trans is not defined, because it is a function of mass, which has constant value. † The standard state is taken to be 1 M as was used by Tidor et al. 61 ‡ Symmetry corrected using σ = 2 in equation 13 from Ref. 61 of −38.6 ± 5.8 kcal/mol for the WT dimer which decreased to −11.9 ± 6.7 kcal/mol when including entropic contributions. 4 This is consistent with earlier studies 61 on the dimerization of WT insulin which found an unfavourable entropic contribution of ≈ 30 kcal/mol (depending on the size and shape of the protein). On the other hand, for relative stabilization free energies upon mutation the entropic part is less important except for the contribution due to vibrations. For the protein variants considered here, T ∆S vib ranges from −2 to 3.5 kcal/mol and hence can contribute up to 6 kcal/mol to the differential stabilization of one protein variant relative to another one.
The results in Table 1 show that Ala B24 is the entropically least favored substitution while des-Phe B25 is most favoured. Adding the entropic and enthalpic contributions, the total binding free energies of dimerization of insulins leads to stabilisation ranging from −16 to −7 kcal/mol among the low-lying analogues, i.e., WT, Gly B24 , Ala B24 and D-Ala B24 . The des-Phe B25 mutant with ∆G bind of ∼ 6 kcal/mol is energetically unfavorable and is expected to be monomeric in solution.
Overall, 46 independent (10 for WT, Gly B24 , Ala B24 and des-Phe B25 , 6 for D-Ala B24 ) free energy simulations were performed, each 10 ns long, which amounts to a typical aggregate of 100 ns for each system studied. These simulations were also run in different statistical mechanical ensembles and Table S1 in the SI provides a comprehensive summary. Considering the enthalpic part of ∆G bind for all the trajectories (WT and B24 insulin dimer analogues) two situations can be distinguished (see Table S1, and Figures S2 to S7 in the SI for illustra- and D-Ala B24 show appreciable differences in binding energies among various trajectories. Specifically, for the Gly B24 mutant a strongly (SI) and a weakly interacting (WI) dimer is found which will be discussed further below.
Dimer stabilization free energies were also determined from thermodynamic integration (TI,  Table 1. The des-Phe B25 variant is unstable in TI as was also found from MM-GBSA. Hence, the TI and MM-GBSA simulations are consistent with one another and support the experimental observation that the Ala B24 mutant is marginally stable/unstable in solution 31,32 which serves as an additional validation of the present simulations.   Table 2. In summary, the MM-GBSA and TI simulations all agree in that WT (Phe B24 ) is most stable, followed by Gly B24 and Ala B24 mutants. The des-Phe B25 variant is unstable. Furthermore, the differential stabilization free energies of two TI simulations differ by 1 to 1.5 kcal/mol which is, however, acceptable given the very different ways in which they were carried out.

The Weakly and Strongly Interacting Gly B24 Dimer
The MM-GBSA simulations found a strongly (SI) and a weakly (WI) interacting Gly B24 dimer. For the SI variant (∆G ≈ −48 kcal/mol) the dimer is stabilized almost as strongly as the WT dimer, and for WI, the dimer is considerably destabilized by almost 30 kcal/mol (∆G ≈ −20 kcal/mol). Figure S1 shows per-residue contributions to the total binding free energies of WT, SI and WI Gly B24 dimers. This analysis indicates that the differences mainly arise from contributions to the electrostatic < E elec > and solvation energy < G elec,desolv > (see Figure S2). For the WT and Gly B24 -SI insulin dimer the per-residue binding free energies follow a similar pattern whereas for Gly B24 -WI they differ (see Figure S1). For instance, most of the residues have favorable contributions to the total binding free energies for WT and SI (orange and yellow bars in Figure S1, respectively), whereas for the WI dimer (purple bars in Figure S1) these contributions are clearly reduced or even reversed which gives rise to reduced stabilization. Four residues of WI destabilize the dimer by > 2 kcal/mol.
One residue that contributes significantly to the differences between WI and SI is Glu B13 (see Figure S1) which makes unfavorable contributions of about 2.5 and 3.4 kcal/mol to the B1 and B2 chain, respectively. The differences were further analyzed and the electrostatic part was found to be primarily responsible for that, see Figure S8. The pattern of the per-residue contributions to the total dimerization free energy (see Figure S1) suggests that the two monomers in WT, SI and WI dimers are equivalent although they are not strictly symmetric as was reported, e.g., for the crystal structure of the B9 (Ser→Glu) mutant insulin dimer 62 which was not symmetric.
In Figure 3B and D two H-bond distances between the side chains of residues His B10 and Glu B13 are reported for SI and WI, respectively. For Gly B24 -SI only a transiently formed intramonomer hydrogen bond is found, see Figure 3B. Contrary to that the two side chains His B10 and Glu B13 form an intramonomer H-bond which makes the donor and acceptor atoms of the two side chains unavailable for dimerization contacts, see Figure 3 D. This, in turn, reduces the stability as determined in the current protocol (MM-GBSA) and partially explains the difference of 20 kcal/mol between the SI and WI-dimer for the Gly B24 mutant. In addition to the loss of ≈ 6 kcal/mol, the dimer stability is also reduced due to four β-sheet

Hydrogen Contacts at the Interface
In the insulin dimer the Phe B24 -Phe B25 -Tyr B26 segment of monomer I (chain B1) forms an antiparallel β-sheet with the adjacent Tyr B26 -Phe B25 -Phe B24 segment of monomer II (chain B2, see Figure 4B). Inter-chain H-bonds are formed between Phe B24 (I) and Tyr B26 (II) and between Phe B24 (II) and Tyr B26 (I). Therefore, substitutions in this region influence dimer formation. Previous work 60 has reported that insulin dimerization is enthalpically controlled and the four inter-monomer H-bonds in the apolar environment are the prime driving force for insulin assembly. However, MD simulations 4 have shown that insulin dimerization primarily results from nonpolar interactions, in particular B24-B26 residues make the largest favorable contributions and the role of the H-bonds is to provide the necessary directionality of the interactions.  63 For these directional H-bonds at the dimerization interface the definition of an angle is not mandatory as the distinction between a "formed/established" and a "broken" H-bond for the purpose of the present analysis is straightforward, see Figure 6. Depending on this definition the persistence times of the dimerization contacts change somewhat but not the conclusions that are drawn from the analysis. (B) Backbone representation of the H-bonds d 1 to d 4 and the weaker CH-S contacts ρ 1 and ρ 2 along the dimerization interface, see also Figure 6. Figure 4A reports the population of H-bonds between the two monomers involving residues 24-26 of chains B1 and B2. These involve (see Figure 4B) intermonomer contacts ρ 1 ( CysB19 SG-CA XB24 ) in monomer I, ρ 2 ( CysB19 SG-CA XB24 ) in monomer II, and intramonomer , where X is Phe for WT or Ala and Gly for the mutants considered and where the first residue belongs to monomer I and the second corresponds to monomer II.
Probability distributions of H-bonds (see Figure 4A Table 3. Except for the B2-B1 PheB24 H-O TyrB26 contact (47 %) 36 these occupancies are in reasonable agreement with previous work.
Given the symmetric nature of the interface such a low occupancy is unexpected and the present results appear to be more realistic. Furthermore, the simulation time in the current work is considerably longer and the size of the simulation box is also larger, hence certain differences are not unreasonable.
For the weakly interacting dimer the H-bond occupancies are considerably reduced to between 0 and 34 %, see Table 3. These direct, inter-monomer H-bond occupancies correspond to the average number of H-bonds found in the H-bond distributions in Figure 4A. The CH· · · S contacts (which are not labelled "H-bonds" here) are weak interactions and for model systems such as C 2 H 2 · · · SH 2 they were found to be stabilized by −1.34 kcal/mol at the MP2/aug-cc-pVQZ level of theory. 64

Role of Water at the Interface
In search for a molecular explanation why the Gly B24 mutant is prone to destabilize, the hydration environment of the intermonomer H-bonds at the dimerization interface was further analyzed. The stability of the insulin dimer interface would decrease if monomer-monomer contacts would be replaced by protein-water interactions. Hence, the interfacial contacts, in particular the four β-sheet H-bonds (see Figure 4B) conserved and/or replaced by watermediation in various insulin dimer analogues are explored. For this, the H-bonds around position B24 were analyzed and the results are summarized in Table 3.
For the WT dimer the water-protein H-bonds are occupied by 10 % to 30 % whereas for the strongly interacting Gly mutant almost no H-bonds to the water are found. Conversely, the weakly interacting Gly-dimer has occupations ranging from 15 % to 68 % which correlates with the reduced number of direct protein-protein H-bonds. This suggests that water molecules along the dimerization interface can replace direct protein-protein contacts. In general, low occupancy of protein-protein contacts implies a high population of water-mediated contacts.
In order to further characterize the stability, inter-and intramonomer contacts and close encounters with solvent water molecules for the different insulin dimers, and to support the findings described so far, additional 40 ns MD simulations for the Ala B24 and Gly B24 mutants and the WT were carried out, see Figure 6). The average RMSD compared to the 4INS reference structures are 2.0 Å, 2.5 Å, and 2.8 Å for the WT Ala B24 , and Gly B24 mutants,  Figure 5b) counts one inter-monomer + one water-protein H-bond. "bridging type I" (Figure 5d) and "bridging type III" (Figure 5f) count two water-protein H-bonds. Because of this the total occupancy can be > 100 %. respectively (see also Figure S10). For these simulations the H-bonds along the dimerization interface and additional geometrical determinants including the water-occupancy were analyzed. This yielded several types of water-protein H-bonds, shown and described in Fig-ures 5a to f. "Bridging" water-protein H-bonds water have significant interactions with both monomers and thus provide stabilization of the dimer. However, in an MM-GBSA analysis where explicit water molecules do not appear, their influence on the stability is not included. The water-protein H-bond occupancy is higher in Gly B24 -WI mutant than in WT and Gly B24 -SI (see Table 3). Throughout the MD simulations of Gly B24 -WI, consistently one or two water molecules are involved in H-bonds between Gly B24 and Tyr B26 . The horizontal blue dashed line denotes the minimum distance of ρ 1 that was sampled in the Gly simulation. Vertical dashed lines indicate key points in the simulations. For feature I ρ 1 is at a minimum (0.5 Å below the 4.1 Å threshold) in the Gly simulation and intermonomer bond breakage is clearly observed for all four intermonomer bonds. For feature II -where ρ 1 meets the 4.1 Å threshold for the Ala B24 mutant and intermonomer bond breakage is observed. Figure 6 reports several important distance time series to further characterize the dimerization interface. They include inter-and intramonomer distances ρ 1 , ρ 2 and d 1 to d 4 , see Figure 4B. In Figure 6 the symbols mark times when one (red cross) or two (green plus) water molecules are within 3.5 Å of both atoms of the intermonomer N-O pair. Individual water molecules have also been found to be relevant in simulations of the insulin monomer in water, 65 in HIV-I protease 66,67 or in controlling rebinding of NO to microperoxidase. 68 For Gly B24 frequent and spontaneous insertion of one or even two water molecules to replace the direct protein-protein NH-O bond is found. This is sometimes but not exclusively accompanied by losing the NH-O contact and differs considerably for the Ala mutant and the WT protein for which the NH-O Hydrogen bonds are intact for most of the simulation and water molecules are considerably less frequently close to the hydrogen bond. Again, the Gly B24 mutant clearly displays a two-state behaviour as already found above: with the NH-O hydrogen bond intact (as is the case for WT and most of the Ala mutant simulations) which corresponds to the SI Gly B24 dimer, or with the H-bond broken and typically replaced by a solvent water which is the situation in the WI dimer.
This analysis also provides molecular-level insight into the origins of intermolecular H-bond breaking and water-insertion. The Phe→(Ala,Gly) mutations replace a bulky phenylalanine residue at position 24 by considerably smaller CH 3 (Ala) or H (Gly) moieties (see Figure 7).
Hence, the mutations lead to increased conformational freedom of the side chains. This in turn affects the distance(s) ρ 1 (and/or ρ 2 ), see Figure 6. As a reference, for the WT protein this distance ranges from 4.5 Å to 5.5 Å but decreases to below 4.0 Å for the Gly mutant.
For the Ala mutant the separation is closer to the situation in the WT protein. The data in Figure 6 suggests that there are two conditions which lead to breaking of the protein-protein hydrogen bond: (1) the distance CysB19 SG-CA X24 in at least one monomer, i.e. ρ 1 or ρ 2 , must be below a threshold of 4.1 Å; and (2) water molecules must be within the vicinity of the intermonomer N-O pair. Furthermore, the data suggests that the smaller ρ 1 or ρ 2 , the Figure 7: Schematic illustration for the relative sizes of residue 24 in wildtype insulin (Phe B24 ) and the two mutants (Ala B24 and Gly B24 ). Cys B19 is also highlighted to show the variation in steric hindrance between the three cases and to indicate the impact on the key distance, ρ 1 or ρ 2 .
larger the distance between the intermonomer N-O pair (e.g. 'I' compared to 'II' in Figure   6). Note also that the behavior of the protein-protein H-bonds were closely connected and the breakage of one was often correlated with the breakage of the others.
The present analysis reveals that in Gly B24 -WI extensive water-mediated H-bonds entirely replace the β-sheet H-bonds. This is also consistent with the DCCM maps (see Figure S9) where the inter monomer H-bonds are absent for the WI dimer. As a result, the Gly B24 -WI dimer has fewer stabilizing inter-monomer interactions resulting in a considerably lower ∆G bind compared to WT and Gly B24 -SI within the present MM-GBSA approach. Ala B24 has inter-monomer H-bond occupancies similar to WT, whereas D-Ala B24 partly differs from WT, in particular one H-bond (D-Ala B24 (monomer II):H· · · Tyr B26 (monomer I):O) is missing throughout the simulation (see Table 3). Consequently, D-Ala B24 was found to have a larger number of water-protein H-bonds than Ala B24 . The inter-molecular hydrogen bonds, one of the main factors driving WT insulin dimerization, are absent in the des-Phe B25 mutant, thereby preventing its dimerization.

Conclusions
Residue PheB24 plays an essential role in insulin folding, assembly, stability, receptor binding and hormonal signalling. 30 In the present work changes in protein dimer stability resulting from mutations at position B24 were studied using (1) MD simulations with explicit water and (2) free energy simulations using MM-GBSA and TI. To the best of our knowledge, the present work is the first systematic computational study of the relative stabilities and dynamics of dimeric insulin analogues at position B24. The simulations support and extend earlier findings of the importance of residue B24 for the structural integrity of the hormone. 33 MM-GBSA and TI provide reliable information about the stabilisation of insulin dimers (WT and mutants). Compared to the experimentally determined stabilisation of −7.2 ± 0.8 kcal/mol for the WT, TI finds −8.4 ± 0.2 kcal/mol which is in good quantitative agreement.
MM-GBSA is useful for qualitative and comparative purposes but not for quantitative studies. On the other hand, the relative stability changes from TI and MM-GBSA agree quite favourably and suggest that MM-GBSA is useful for ranking the stabilities of WT and mutant dimers.
Substitutions at position B24 of the insulin dimer-forming surface by Gly, Ala and D-Ala amino acid residues give dimeric insulin analogues with reduced dimer stability relative to the WT dimer. The des-PheB25 is exclusively monomeric, as was found by NMR experiments and serves as an additional validation in the present work. The presence of a WI and SI variant of the Gly B24 mutant originates from H-bonds which are direct B1↔B2 protein-protein contacts in the SI dimer but water-bridged in the WI dimer and from changes in the orientation of residue B13. This highlights that modifications at one site (B24) can have substantial functional effects. Bridging water molecules replacing H-bonding interactions along dimerization (and oligomerization) interfaces is most likely an ubiquitous feature and should probably be included in estimating dimerization energies which is, however, not routinely done.
Simulations can thus complement existing experiments and provide molecular-level insight for observed differences between chemically related systems. Also, they provide important information for situations in which experiments are technically difficult or impossible as they may be the only direct method to probe the structure, energetics and dynamics at atomic resolution.