Vibrational Spectroscopy of N3- in the Gas and Condensed Phase.

Azido-derivatized amino acids are potentially useful, positionally resolved spectroscopic probes for studying the structural dynamics of proteins and macromolecules in solution. To this end, a computational model for the vibrational modes of N3- based on accurate electronic structure calculations and a reproducing kernel Hilbert space representation of the potential energy surface for the internal degrees of freedom is developed. Fully dimensional quantum bound state calculations yield the antisymmetric stretch vibration at 1974 cm-1 compared with 1986 cm-1 from experiment. This mode shifts by 64 cm-1 (from the frequency distribution) and 74 cm-1 (from the IR line shape) to the blue, respectively, compared with 61 cm-1 from experiment for N3- in water. The decay time of the frequency fluctuation correlation function is 1.1 ps, which is in good agreement with experiment (1.2-1.3 ps) and the full width at half maximum of the asymmetric stretch in solution is 18.5 cm-1 compared with 25.2 cm-1 from experiment. A computationally more efficient analysis based on instantaneous normal modes is shown to provide comparable, albeit somewhat less quantitative results compared to solving the three-dimensional Schrödinger equation for the fundamental vibrations.


Introduction
Characterizing the structural and functional dynamics of complex systems in the condensed phase is a challenging problem, spanning several spatial and temporal scales. 1 One particularly elegant way to quantitatively assess the structure and dynamics of the solvent environment surrounding a probe molecule is to use optical spectroscopy, especially 1-and 2-dimensional infrared (1D-IR, 2D-IR) spectroscopy. 2 Two-dimensional IR spectroscopy is a powerful method for measuring the subpicosecond to picosecond dynamics in condensedphase systems and considerably extends the toolbox of optical spectroscopy.
Using 2D-IR spectroscopy, the coupling between inter-and intramolecular degrees of freedom such as the hydrogen bonding network in solution, or structural features of biological macromolecules can be investigated by monitoring the fluctuation of fundamental vibrational frequencies of a probe molecule or ligand attached to a complex or a biological macromolecule. For quite some time, the amide-I stretching frequency has been used for this and has provided fundamental, molecular-level insight into the structural dynamics of small molecules and proteins alike. 3,4 For example, the CO chromophore was used as a probe to investigate the solvation dynamics of N-Methylacetamide in water. 3,[5][6][7][8][9][10] Similarly, the CN − stretching frequency has been used as a probe to investigate its own solvation dynamics [11][12][13][14] and structural and energetic features of a protein-ligand complex by attaching the probe to benzene. 15,16 A versatile, spatially sensitive spectroscopic probe can also report on the dynamics of a system while minimizing the perturbations induced. The fundamental modes of a triatomic ligand such as the azide ion (N − 3 ) are weakly coupled to the molecular framework it is bound to and can act as a sensitive probe for the environmental dynamics. The asymmetric stretching mode of N − 3 has a large oscillator strength and a vibrational transition frequency well separated from most organic chromophores. This makes it an ideal spectroscopic probe. [17][18][19][20][21][22][23][24] Other triatomic anions such as SCN − , NCS − or OCN − have also served as spectroscopic probes. 25,26 For a quantitative molecular level description of the environmental dynamics, a high-quality potential energy surface (PES) for the internal vibrational dynamics of the probe is required.
Typically, differences of a few cm −1 are found when immersing the same probe into two different electrostatic environments, e.g. within a wild type and a single mutant protein. 15,16 Here, we use a combination of normal mode and numerically exact bound state calculations together with MD simulations in the gas-and condensed phase to characterize the vibra-tional dynamics of N − 3 with the aim to provide the basis for its use as a covalently linked, positionally sensitive spectroscopic probe. The instantaneous deviation in the solute-solvent interactions is reflected by the fluctuation in the transition frequency of the fundamental modes. The time scale of the structural changes around the solute molecule can be determined from the decay of the frequency fluctuation correlation function (FFCF) from which also the lineshape of the 1D-IR spectrum can be obtained. 4 The vibrational dynamics of N − 3 in the gas phase and solution has been investigated from experiment and computation. Experimentally, the NN asymmetric stretching frequency of the azide ion was determined to be at 1986.47 and 2047.5 cm −1 in the gas phase and bulk water, respectively 20,27,28 which make it a potentially useful environmental probe for protein dynamics, see Figure S1. Using non-linear infrared spectroscopy vibrational lifetimes of the asymmetric stretch fundamental of azide anion in water have been measured experimentally to be T ≈ 1.2 ps (in H 2 O) 29 and T = 1.3 ps (in D 2 O). 18 The band width of the N − 3 asymmetric stretch in bulk water is 25.2 cm −1 . 20,29 Due to the triple bond character of azide, 30 its structure would stabilize through interaction with the solvent which leads to a blue shift (61 cm −1 ) compared to the gas phase frequency.
Recently, an explicitly parametrized PES was computed using an accurate composite method based on pointwise, individually scaled fc-CCSD(T*)-F12b calculations including scalar relativistic effects and the aug-cc-pV5Z basis set. Using this PES, variational calculations determined the lower bound states for N − 3 in the gas phase with spectroscopic accuracy. 31 However, such a composite method is not feasible for the ultimate purpose of the present development, which is an accurate representation of the inter-and intramolecular interactions for N − 3 as a spectroscopic probe covalently linked to another molecular building block.
In order to 1) accurately capture the energetics of distorted N − 3 geometries away from the equilibrium structure and 2) to cover situations in which the probe (N − 3 ) and the moiety it is linked to (e.g. an amino acid or a small organic molecule) are coupled electronically, it was decided to compute the PES at the multi reference CI (MRCI) level of theory. This also allows to extend the grid to a range that covers geometries further away from the equilibrium to sample conformations in MD simulations in sterically demanding environments due to local interactions and constraining effects.
In an effort to quantitatively capture the environmental dynamics, the present work introduces a computational model for the azido group capable of describing the molecular vibrations of the probe in gas-and the condensed-phase in a realistic manner. To this end, an accurate intramolecular, fully dimensional PES for N − 3 is computed based on high-level electronic structure calculations and represented as a reproducing kernel Hilbert space (RKHS).
The vibrational transition frequencies of the fundamental modes are calculated from instantaneous normal mode (INM) analysis and from solutions of the nuclear Schrödinger equation.
Results from bound state calculations including the vibrational line shapes from frequency correlation functions and the time scales for the environmental dynamics derived from them are then compared with data from experiments in the gas-and condensed phase.
First, the computational methods are presented. This is followed by quantum bound state and wave packet calculations to determine the stationary states for the low lying bound states. Next, the solvation dynamics in water is considered and the 1D-IR and the frequency fluctuation correlation functions, which are directly related to 2D-IR spectroscopy, are determined and discussed.

Methods
The Potential Energy Surface for N −

3
The ab initio energies were computed at the multi reference configuration interaction (MRCI) level of theory including the Davidson correction (MRCI+Q) with the augmented Dunning type correlation consistent polarized quadruple zeta (aug-cc-pVQZ) basis set using the Molpro software. 32,33 All calculations were carried out for the 1 A electronic state in C s symmetry.
Initial orbitals for the MRCI calculations were computed using the state averaged complete active space self-consistent field (SA-CASSCF) method for the first two 1 A states with 16 electrons in 12 active orbitals and the 1s orbitals of the nitrogen atoms were kept frozen.
The ab initio energies were calculated in Jacobi coordinates (R, r, θ) (see the inset in Figure   1), where r is the distance between the two nitrogen atoms (N1 and N2), R is the distance between their center of mass and the third N atom (N3), and θ is the angle between r For the 1-dimensional kernels the linear problem f (x i ) = j α j k(x i , x j ) is solved which yields the coefficients α j from which the value of the function f (x) = i α i k(x i , x) can be obtained for arbitrary x. The 3-dimensional kernel K(X, X ) is then a tensor product of three 1-dimensional kernels. 36,37 Based on this, the 3-dimensional kernel K is where X stands for all dimensions involved and which maps the angle θ onto the interval [0, 1]. The reciprocal power decay kernel (k (n,m) ) with smoothness n = 2 and asymptotic decay m = 6 is used for the radial dimensions (i.e., r and R) where x > and x < are the larger and smaller values of x and x , respectively. For the angular degree of freedom, a Taylor spline kernel is used where z > and z < are similar to x > and x < . Here, homonuclear Jacobi coordinates are employed because the PES is expressed in this coordinate system and the necessary angular integrals can be done efficiently and accurately using Gauss-Legendre quadrature. However, the problem can also be solved using other coordinate systems, e.g. bond-length bond-angle or Radau coordinates (ideal for heavylight-heavy systems). 39 Ultimately, the choice of coordinate system primarily affects the rate of convergence of the eigenvalues which is relevant for highly excited states, but less so for the fundamentals considered in the present work. Here, the initial wave function is described To cross-validate the results from DVR3D, the Schrödinger equation for N − 3 was also solved for J = 0 following a time dependent wave packet approach. 40 Within this methodology, an initial wave packet |ψ(0) is propagated using the split operator 41 method, followed by  Figure S2 and is shown to be identical to the one for simulations without the solute.

Quantum Bound State Calculations
For simulations in the condensed phase, nonbonded parameters for the N − 3 solute are required. The Lennard-Jones parameters for the nitrogen atoms were = 0.08485 kcal/mol and R min /2 = 1.66Å. 45 Charges are calculated following the Mulliken population analysis from the density matrix obtained at the MRCI/aug-cc-pVQZ level of theory. This yields a charge of −0.53462e for the terminal nitrogen atoms and 0.06924e for the central one. As an extension, multipolar representations of the electrostatics could also be considered although for closed-shell systems (e.g. N-methyl-acetamide) it has been found that point charge models can be quite reliable for spectroscopic 46 and thermodynamic 47,48 properties.
From the production simulation, 1.2 × 10 6 snapshots are taken as a time-ordered series for computing the frequency fluctuation correlation function and the 1D-IR spectrum. The To compute the line shape function g(t), the FFCF is fitted to a general expression 49 where a i , τ i , γ and ∆ 0 are fitting parameters. The parametrization of this fitting function is motivated by the overall shape of the FFCF 49 and has been used in previous work. 14,50 It is an extension of the typical multiexponential decay, which is traditionally employed, 18 in order to capture the anticorrelation at short times. Furthermore, this functional form also allows analytic integration 4 to obtain g(t) in Eq. 5. The decay times τ i of the frequency fluctuation correlation function reflect the characteristic time-scale of the solvent fluctuations to which the solute degrees of freedom are coupled.
The 1D-IR spectra is then calculated as 20 where ω is the average transition frequency obtained from the distribution,

Spectroscopy and Dynamics in the Gas Phase
The quantum bound states for N − 3 were first calculated using DVR3D for J = 0 and 1. To validate these results the bound states for J = 0 were also determined from time-dependent wave packet calculations. The first 9 bound states for J = 0 obtained from both are given in Table S1 in the SI. For the fundamental stretching modes the quantum numbers were assigned upon inspection of the nodal structure of the wave functions, see The three fundamental transition frequencies obtained from different methodologies are reported in Table 1 along with the experimentally 27,28,51 and theoretically 31 determined frequencies from the literature. The normal modes shown in Table 1 were calculated using CHARMM together with the RKHS PES. It can be seen that except for the asymmetric stretch, anharmonic and harmonic frequencies agree to within < 5 cm −1 . Experimentally, the NN asymmetric stretch in the gas phase was found at 1986.47 cm agree favourably with the present calculations, see Table 1.
As an independent validation of the bound state calculations, N V E MD simulations with the RKHS-MRCI PES were carried out for N − 3 in the gas phase using CHARMM. Power spectra for the distances r i (Ni-Nj separation) were determined and compared with results from bound state calculations. Similar analyses were carried out for N − 3 in solution. Moreover, the infrared spectra for N − 3 in water was also calculated from the dipole moment autocorrelation     Figure 5 and Table 2.

Dynamics and Spectroscopy in Solution
To serve as a positionally resolved spectroscopic probe, it is important to characterize the vibrational spectroscopy of N − 3 in solution. The power spectra (see Figure 5) Table 2: Vibrational Frequencies cm −1 for N − 3 in solvent from experiment 20,29 and the present simulations. For the asymmetric stretch fundamental [0, 0, 1] the vibrational blue shift from P (ω) and the line shape (asterisk) with respect to the gas phase values is also indicated. For INM this is a harmonic shift ∆ω and for all other cases it is an anharmonic shift ∆ν with respect to the fundamentals in the gas phase, see Table 1.

Mode
Expt To determine the peak frequencies of the distributions P (ω), the raw data was fitted to a log normal probability distribution, see Figure S3. The peak values of P (ω), are reported in Table 2 Table 2). For N − 3 in solution it is expected that P (ω) deviates somewhat from a Gaussian distribution. Because all fundamentals are shifted to the blue, it is also expected that P (ω) contains a tail at higher frequencies. This is indeed the case ( Figure S3), and the log-normal fits the raw data quite well which indicates that sampling is sufficient and the frequency trajectory is close to converged.
The blue shift from solving the 3-d Schrödinger equation is 64 cm −1 (according to P (ω)) and 74 cm −1 when considering the IR-lineshape, see Table 2 and Figure 6. This is in reasonable to good agreement with experiment (61 cm −1 ) and provides a meaningful basis for future applications of the present model in positionally resolved spectroscopy of peptides and proteins in solution. It is also worthwhile to note that the power spectrum, which is readily available from equilibrium MD simulations, yields a meaningful description of the blue shift (∆ν = 57 cm −1 ). In order to characterize the solvent dynamics around N − 3 in solution, the frequency correlation function for the NN asymmetric stretch from INM was determined. The data was fitted to Eq. 6 including three time scales, see Figure 7 and the corresponding fitting pa-rameters are reported in Table 3. Three time scales can be distinguished in the FFCF: two sub-picosecond time scales (τ 1 = 0.044 ps and τ 2 = 0.23 ps) with large amplitude and one on the picosecond (τ 3 = 1.18 ps) with a smaller amplitude.
Similarly, the FFCF for the asymmetric stretching mode was determined from the same 1.2 × 10 6 snapshots using DVR3D to solve the 3-dimensional nuclear Schrödinger equation based on scanning the 3-dimensional PES for an instantaneous solvent configuration. The FFCF determined from this time series was again fitted to Eq. 6 using three time scales, see Figure 7B. The corresponding fitting parameters are also reported in Table 3 Figure 6 shows the 1D-IR spectra of the NN asymmetric stretching mode using two different approaches (INM and DVR3D) to compute the frequencies. The observed and computed solvent induced shifts are 45 cm −1 and 74 cm −1 , respectively, for INM and DVR3D calculations, compared with 61 cm −1 from experiment, 20,27,28 see Table 2. They both correctly capture the blue shift but the quantum calculations appear to better describe the 1D-IR spectroscopy. A similar observation is made for the full widths at half maximum (FWHM). Independently, using a suitable flexible, spectroscopically accurate water model such as the Kumagai, Katwamura, Yokokawa (KKY) model [56][57][58] it will be possible to probe the coupling between the intramolecular degrees of freedom of solute and solvent as their fundamentals and overtones lead to potentially interesting dynamics. The present model can be further improved by using higher order multipoles [59][60][61][62] or polarization 63 to treat the electrostatic interactions. Also, a slight adjustment of the bonded interactions could be envisaged as using stretch and bending potentials from gas phase calculations for simulations in solution is another approximation that is used in the present work. This will be of particular importance when the present model is used for N − 3 as a covalently linked probe to a peptide or a protein residue.
Together with state-of-the art experiments the present work lays the foundation to a molecularly refined picture of the structural dynamics of complex systems in the condensed phase from a combined experimental/simulation approach.

Supporting Information Available
For Table S1, Figures S1 to S3 see supporting information. This material is available free of charge via the Internet at http://pubs.acs.org/.