Prediction of acidity in acetonitrile solution with COSMO‐RS

The COSMO‐RS method, a combination of the quantum chemical dielectric continuum solvation model COSMO with a statistical thermodynamics treatment for realistic solvation simulations, has been used for the prediction of pKa values in acetonitrile. For a variety of 93 organic acids, the directly calculated values of the free energies of dissociation in acetonitrile showed a very good correlation with the pKa values (r2 = 0.97) in acetonitrile, corresponding to a standard deviation of 1.38 pKa units. Thus, we have a prediction method for acetonitrile pKa with the intercept and the slope as the only adjusted parameters. Furthermore, the pKa values of CH acids yielding large anions with delocalized charge can be predicted with an rmse of 1.12 pKa units using the theoretical values of slope and intercept, resulting in truly ab initio pKa prediction. In contrast to our previous findings on aqueous acidity predictions, the slope of the experimental pKa versus theoretical ΔGdiss was found to match the theoretical value 1/RT ln(10) very well. The predictivity of the presented method is general and is not restricted to certain compound classes. However, a systematic correction of −7.5 kcal mol−1 is required for compounds that do not allow electron delocalization in the dissociated anion. The prediction model was tested on a diverse test set of 129 complex multifunctional compounds from various sources, reaching a root mean square deviation of 2.10 pKa units. © 2008 Wiley Periodicals, Inc. J Comput Chem, 2009


Introduction
Proton transfer is one of the fundamental processes in chemistry and biology. Thus, understanding and predicting the thermodynamics of the proton transfer reaction and the dissociation constants of acids and bases in different solvents are of crucial importance in many areas of chemistry and biochemistry. Experimental measurement of aqueous phase pKa values has nowadays become an inexpensive standard application [1]. The same cannot be said about the measurement of pKa values in nonaqueous solvents. In addition, there are broad classes of chemicals that are not readily amenable to experimental characterization (e.g. reaction intermediates, and very strong or very weak acids or bases with a pKa outside the "natural" pKa range that can be conveniently measured). Consequently, considerable effort has been devoted to developing first-principles prediction methods for pKa values.
Acetonitrile is a useful solvent for ionic reactions, including acid-base reactions. It has a high dielectric constant (ε = 36.0) [2] and thus favors dissociation of ion pairs into ions. At the same time it has low basicity and extremely low acidity, resulting in a very low autoprotolysis constant [3] of pKauto ≥ 33. The low acidity also implies that acetonitrile has a very low ability for specific solvation of anions. Taken together, these properties make acetonitrile a very good differentiating solvent, especially for studies of acids. pKa measurements of acids and bases in acetonitrile date back to the classic works of the groups of Kolthoff [3,4] and Coetzee [2,5] in the 1960s. The pKa data in acetonitrile published up to 1990 have been gathered in the compilation of Izutsu [6]. During the recent decade, spectrophotometric pKa scales of acids [7] and bases [8], each containing around a hundred compounds and spanning more than 20 orders of magnitude, have been set up in acetonitrile. These are the most consistent datasets of pKa values currently available in acetonitrile.
The rapid development of efficient quantum chemical (QC) methods in recent years has opened new perspectives for the rigorous prediction of liquid phase pKa values. Of the different quantum chemical methodologies available for the computation of pKa values, the dielectric continuum solvation methods (DCSMs) [9] have become quite popular in recent years [10-17], since they are able to describe long-range electrostatic interactions of solutes accurately at moderate computational cost in the context of quantum chemical programs. DCSMs have well-known deficiencies, namely the neglect of hydrogen bonding and the inadequate treatment of short-range electrostatics [10,18-21], which can be much stronger in ions than in neutrals and thus can introduce a large asymmetry into the solvation energy of an acid compared to its conjugate base. Despite these deficiencies, it is possible to correlate the quantum chemical dissociation free energy ΔGdiss of a solvated molecule with its pKa via a linear free energy relationship (LFER) [10]:

$$\mathrm{p}K_a = c_1\,\frac{\Delta G_{\mathrm{diss}}}{RT\ln(10)} + c_2 \qquad (1)$$

From basic thermodynamics, c1 is expected to be unity if ΔGdiss were calculated without systematic error, and the LFER axis intercept c2 is expected to be equal to −log[Solvent] [22]. A closer look at the DCSM studies [10-17] shows that the regressions of pKa values versus the calculated dissociation free energy ΔGdiss yield slopes that are significantly lower than the theoretically expected value of 1/RT ln(10). Such behavior has been reported for aqueous [10-12] and nonaqueous acids [10,13-15,23] as well as for bases [16,17,24]. This drawback is common to all simple DCSMs unless considerable effort is invested in adjusting numerous additional, and often physically doubtful, parameters of the DCSM.
Atom type or hybridization specific cavity radii and cavity definitions that depend on the charge of the molecule are examples of such parameters [25]. Although such models have become quite popular and successful applications to nonaqueous solvents have been reported [26-28], it remains doubtful whether the predictive power of such empirical adjustments persists for more complex, chemically multifunctional solutes or for solutes such as free radicals, zwitterions, or excited states [23]. Quite some effort has been devoted to the computational prediction of pKa values in acetonitrile. Most of the works have focused on the computation of pKa values of cationic acids (protonated bases) and, to the best of our knowledge, all of them use experimental pKa data to achieve useful predictive power. Moreover, the adjustment of these cavity-specific parameters (and thus also the quantum chemical DCSM computation of the solute acid and conjugate base) has to be done anew for each new solvent considered, making this approach hardly practical or extensible.
To avoid such problems, Chipman [23] proposed a DCSM on isodensity cavities, which claims to describe both cationic and neutral acids by a single correlation line between computational and experimental pKa values. There are, however, only six data points, which is too few, and all the cationic acids included in the correlation have lower pKa values than any of the neutral acids. Furthermore, in refs. 7 and 29, new, more accurate pKa values for acetic acid, benzoic acid, and phenol have been published, which are all higher (by up to 2 pKa units) than the earlier values used by Chipman. Substituting the new values into the correlation increases its rmse from 0.3 to 0.6 pKa units. Thus, as also admitted by Chipman, too far-reaching conclusions should not be drawn. A related isodensity DCSM approach has been used by the Maksic group in a number of computational acetonitrile pKa studies of bases [30]. Most of their works aim at (and achieve) highly accurate pKa predictions within groups of closely related compounds and therefore use experimental pKa values of structurally similar compounds to "calibrate" the computations, thus achieving rmse values down to 0.3 pKa units.
A promising approach to the pKa problem, which does not artificially modify the cavity to try to reproduce hydrogen bonding and short-range solute-solvent interaction behavior that is not accounted for by the DCSM, is the addition of explicit solvent molecules to the solute ions [31-34]: a solute anion is represented by a cluster of the anionic solute molecule with one or more surrounding solvent molecules that form a partial or full solvation shell around the ion, accounting for strong solute-solvent interactions in a physical way. Although this approach has the advantage that the slope of the aqueous pKa LFER is reported to be significantly closer to the theoretical slope than with simple DCSMs [31,32], its practical application leads to some ambiguities and problems, especially in the case of nonaqueous solvents. There is no natural choice for the number of solvent molecules that represent the solvent shell, so some arbitrariness remains wherever this choice has to be made. What in practice might turn out to be the much harder problem, however, is the optimization of the solute-solvent cluster. For complex, multifunctional solutes, as most chemically or biologically interesting drug-like compounds are, it is very difficult and computationally demanding to find the global minimum of the weakly bonded solute-solvent complex. If the solvent itself is a complex multifunctional compound, or if a mixture of several solvent compounds is used, it may easily become impossible to find the global minimum of the cluster at all. Given these practical considerations and the large and complex data sets used below, the explicit solvation approach was outside the scope of this study.
In addition, the explicit goal of the study was to provide a methodology that is very simple at the level of the quantum chemistry involved, and in which the solute compounds computed at the quantum chemical level are "transferable", meaning that they can be used for pKa predictions in other solvents or even solvent mixtures as well, without the need to recompute them (as the modified cavity and the explicit solvation models would demand). Thus, we chose an approach different from the ones already mentioned: the Conductor-like Screening Model for Real Solvents (COSMO-RS).
COSMO-RS [18-21] goes beyond the DCSM concept in that it combines the electrostatic advantages and the computational efficiency of the DCSM COSMO [35] with a statistical thermodynamics method for local interaction of surfaces, which takes into account local deviations from dielectric behavior as well as hydrogen bonding. In this approach, all information about solutes and solvents is extracted from initial QC-COSMO calculations, and only very few parameters have been adjusted to experimental values of partition coefficients and vapor pressures of a wide range of neutral organic compounds. COSMO-RS is capable of predicting partition coefficients, vapor pressures, and solvation free energies of neutral compounds with a root mean square error (rmse) of 0.3 log units or better, and considerable experience has been gathered over the past years about its surprising ability to predict mixture thermodynamics [18-20]. Stimulated by the successful COSMO-RS predictions of aqueous acidity [10] and basicity [24] as well as some preliminary studies in nonaqueous solvents [10], we decided to perform a systematic study on the ability of COSMO-RS to predict pKa values of acids in acetonitrile. For that purpose, we calculated ΔGdiss for a broad selection of 93 organic acids in acetonitrile, spanning a pKa range between 3 and 27, using the standard COSMO-RS method implemented in the COSMOtherm program [36] based on Turbomole DFT/COSMO calculations [37-39].

Theoretical Calculations
Our theoretical calculations of ΔGdiss of acids in acetonitrile are based on the reaction model

$$\mathrm{AH} + \mathrm{CH_3CN} \rightleftharpoons \mathrm{A^-} + \mathrm{CH_3CNH^+} \qquad (2)$$

Since we are not interested in the gas phase reaction, we directly calculated the free energy of each species in acetonitrile solution. For that we first applied our standard procedure for COSMO-RS calculations to all four species appearing in eq. (2), which consists of two steps:
1. Full DFT geometry optimization with the Turbomole program package [39] using the B-P density functional [40,41] with a TZVP quality basis set and the RI approximation [42]. During these calculations the COSMO continuum solvation model was applied in the conductor limit (ε = ∞). Element-specific default radii from the COSMO-RS parameterizations have been used for the COSMO cavity construction [19,20]. Such calculations end up with the self-consistent state of the solute in the presence of a virtual conductor that surrounds the solute outside the cavity.
2. COSMO-RS calculations using the COSMOtherm program [36]. In these calculations the deviations of the real solvent, in this case acetonitrile, from an ideal conductor are taken into account in a model of pair-wise interacting molecular surfaces. For this purpose, electrostatic energy differences and hydrogen bonding energies are quantified as functions of the local COSMO polarization charge densities σ and σ′ of the two interacting surface pieces. The chemical potential differences arising from these interactions are evaluated using an exact statistical thermodynamics algorithm for independently pair-wise interacting surfaces, which is implemented in COSMOtherm. More detailed descriptions of the COSMO-RS method are given elsewhere [18-21].
If more than one conformation or different deprotonation sites were considered to be potentially relevant for the neutral or anionic form of the acid AH, several conformations were calculated in step 1 and a thermodynamic Boltzmann average over the total Gibbs free energies of the conformers was consistently calculated by the COSMOtherm program in step 2.
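The conformer treatment in step 2 can be illustrated with a minimal Python sketch. It assumes plain Boltzmann weighting at an assumed temperature of 298.15 K (the text does not state a temperature); the function names and the example energies are hypothetical, and the averaging actually performed inside COSMOtherm may differ in detail.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal mol^-1 K^-1
T_ASSUMED = 298.15     # K; assumption, not stated in the text

def boltzmann_weights(energies, temperature=T_ASSUMED):
    """Return normalized Boltzmann weights for conformer free energies (kcal/mol)."""
    g_min = min(energies)  # shift by the minimum for numerical stability
    factors = [math.exp(-(g - g_min) / (R_KCAL * temperature)) for g in energies]
    total = sum(factors)
    return [f / total for f in factors]

def boltzmann_average(energies, temperature=T_ASSUMED):
    """Boltzmann-weighted mean of the conformer free energies (kcal/mol)."""
    weights = boltzmann_weights(energies, temperature)
    return sum(w * g for w, g in zip(weights, energies))

# Hypothetical conformer free energies in kcal/mol (illustrative values only)
conformers = [-100.0, -99.0, -97.5]
print(boltzmann_weights(conformers))
print(boltzmann_average(conformers))
```

In this sketch the lowest-energy conformer dominates the average, so that conformers lying a few kcal mol−1 higher contribute only marginally.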
For all acids AH, the Gibbs free energy of dissociation (ΔGdiss) has been calculated as the difference of the total free energy of the anion A− and the neutral acid AH. To this free energy difference the free energy difference of CH3CNH+ and CH3CN has been added as a constant contribution:

$$\Delta G_{\mathrm{diss}} = G_{\mathrm{tot}}(\mathrm{A^-}) - G_{\mathrm{tot}}(\mathrm{AH}) + G_{\mathrm{tot}}(\mathrm{CH_3CNH^+}) - G_{\mathrm{tot}}(\mathrm{CH_3CN}) \qquad (3)$$

From the calculation procedure described above, we get Gtot(CH3CNH+) − Gtot(CH3CN) = 253.48 kcal mol−1. This value is in good agreement with literature estimates [23,43]. Zero point vibrational energies are not taken into account. Consequently, the geometries optimized in step 1 were not analyzed for the nature of the stationary point of the optimized geometry. We make the common assumption that the difference in zero point energy between the neutral and the deprotonated acid is generally small [10]. Moreover, we did not take into account the symmetry multiplicity factors of the compounds' conformations, because we did not feel able to do this consistently for all kinds of acids in the same way.
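The bookkeeping behind the dissociation free energy can be sketched in a few lines of Python. The function name and all numerical values below are hypothetical placeholders chosen only to make the arithmetic easy to follow; they are not computed COSMO-RS energies.

```python
def dg_diss(g_anion, g_acid, g_solvent_h, g_solvent):
    """Dissociation free energy in kcal/mol.

    Sum of the solute term G(A-) - G(AH) and the constant solvent term
    G(CH3CNH+) - G(CH3CN), as described in the text.
    """
    solute_term = g_anion - g_acid
    solvent_term = g_solvent_h - g_solvent
    return solute_term + solvent_term

# Placeholder total free energies in kcal/mol (hypothetical, for illustration)
example = dg_diss(g_anion=-720.0, g_acid=-1000.0,
                  g_solvent_h=-750.0, g_solvent=-500.0)
print(example)
```

Because the solvent term is identical for every acid, it only shifts all ΔGdiss values by a constant and therefore affects the intercept of the regression, not its slope.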

Fit Data Set
For the purpose of finding the LFER coefficients of eq. (1), a data set of 93 acids in acetonitrile was used. The data were taken from ref. 7. The pKa values at the lower end of the scale of ref. 7 (below pKa = 9, i.e. starting from TosOH) have been corrected downwards by 0.1 to 0.15 pKa units because we discovered an error in the data of ref. 7 in the region of pKa values 7 to 9. The reason for this error is twofold: (a) in the region of pKa values from 7 to 9 there are only five compounds in the scale (resulting in a smaller number of overlapping ΔpKa measurements than in other parts of the scale) and, even more importantly, (b) three of these five compounds (TosOH, 4-Cl-C6H4SO3H, and C6H5CHTf2) are inconvenient for measurements, as their spectral properties are not very suitable, and in addition TosOH and 4-Cl-C6H4SO3H undergo homoconjugation in MeCN, which, although taken into account, complicates measurements and reduces their accuracy. Because the scale is anchored to the pKa value of picric acid (pKa = 11.0), the error in the region of pKa values 7 to 9 influenced the pKa values of all the stronger acids. The error was discovered by additional careful ΔpKa measurements. Although unfortunate, this shift in pKa values is quite small and has no influence in most applications. The pKa values range between 3 and 27. The dataset consists of (a) 23 OH acids, namely 5 sulfonic acids, 14 aromatic alcohols, 1 aliphatic alcohol, and 2 carboxylic acids; (b) 32 NH acids, namely 3 aromatic secondary amines, 1 aniline, 21 sulfonimides, and 7 carbonylsulfonimides; and (c) 38 CH acids, namely 31 trisubstituted methanes, 6 fluorenes, and 1 cyclopentadiene. The results for all 93 acids in the fit data set are shown in Table 1. The regression of the calculated Gibbs free energy of dissociation (ΔGdiss) versus experimental pKa in acetonitrile is depicted in Figure 1.
Correlation of the complete fit data set results in a correlation coefficient of r2 = 0.857. The regression equation for the pKa of acids in solvent acetonitrile reads

$$\mathrm{p}K_a = 1.06(\pm 0.01)\,\frac{\Delta G_{\mathrm{diss}}}{RT\ln(10)} - 5.6(\pm 0.1) \qquad (4)$$

The calculated axis intercept of −5.6 is in reasonable concordance with the theoretical value of c2,ideal = −log[CH3CN] = −1.28. If we had omitted the free energy difference of CH3CNH+ and CH3CN, which we calculate as −253.48 kcal mol−1, from the definition of ΔGdiss, we would have obtained a regression constant of ĉ2 = 191.6. In contrast to previous findings on aqueous acidity [10] and basicity [24] and dimethylsulfoxide acidity [10], we found that the slope of the regression is close to the theoretical value of 1/RT ln(10). Application of eq. (4) to predict the pKa values of the fit set yields an rmse of 2.53 pKa units.
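How LFER coefficients of this kind are obtained from paired (ΔGdiss, pKa) data can be sketched with an ordinary least-squares fit. The following Python sketch assumes a temperature of 298.15 K, which the text does not state, and uses a hypothetical function name. The demonstration points are synthetic: they are generated from the published coefficients 1.06 and −5.6 themselves and are not data from Table 1.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal mol^-1 K^-1
T_ASSUMED = 298.15     # K; assumption, not stated in the text

def fit_lfer(dg_values, pka_values, temperature=T_ASSUMED):
    """Least-squares fit of pKa = c1 * dG/(RT ln 10) + c2; returns (c1, c2, r2)."""
    scale = R_KCAL * temperature * math.log(10)
    x = [dg / scale for dg in dg_values]       # dG in pKa units
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(pka_values) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in pka_values)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, pka_values))
    c1 = sxy / sxx                             # slope
    c2 = mean_y - c1 * mean_x                  # axis intercept
    r2 = (sxy * sxy) / (sxx * syy)             # squared correlation coefficient
    return c1, c2, r2

# Synthetic, noise-free points built from the coefficients 1.06 and -5.6
scale_demo = R_KCAL * T_ASSUMED * math.log(10)
dg_demo = [10.0, 20.0, 30.0, 40.0]             # kcal/mol, hypothetical
pka_demo = [1.06 * dg / scale_demo - 5.6 for dg in dg_demo]
c1_fit, c2_fit, r2_fit = fit_lfer(dg_demo, pka_demo)
print(c1_fit, c2_fit, r2_fit)
```

Expressing ΔGdiss in units of RT ln(10) before fitting makes the fitted slope directly comparable with the theoretical value c1 = 1.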
A closer look at the regression (Table 1 and Figure 1) reveals that there are systematic deviations: the regression splits into two distinct groups with slightly different slopes (both of which are close to the theoretical slope) and significantly different axis intercepts. This is an interesting behavior, which is not observed in the COSMO-RS ΔGdiss versus pKa correlations for solvent water (neither for acids [10] nor for bases [24]) or for acids in the nonaqueous solvent dimethylsulfoxide [10]. Analysis of the electronic structure of the molecules involved suggests the presence of two groups of acids. The classification of the compounds into these groups is correlated with the level of charge delocalization in the anions. The anions with localized charges have strong interactions with solvent molecules, which results in a strong influence of solvation on the pKa values. This influence is not fully taken into account by the calculations. At the same time, the acids (especially CH acids) that yield anions with delocalized charges are less affected by solvation and their acidities are better predicted.
If one compares the ab initio pKa values of the CH acids (calculated directly from eq. (1) using the theoretical values of the coefficients, c1,ideal = 1 and c2,ideal = −log[CH3CN] = −1.28) to the experimental pKa values, it can be seen that the agreement is very good. Only the two acids that give the most charge-localized anions (octafluorofluorene and (C6F5)CH(COOEt)2) deviate by more than 2 pKa units. The rmse is 1.12 pKa units. If we exclude these two acids we arrive at rmse = 0.86 pKa units, which is excellent, keeping in mind that the pKa values are not adjusted in any way! All the acids dissociating from a carbon atom (CH acids) included in the dataset derive their acidity from an extensive charge delocalization that stabilizes the anion. The anionic centre is conjugated to one or more aromatic systems, and those are substituted by electronegative (in most cases heavily: perfluorinated) or resonance acceptor groups. All CH acids with the exception of octafluorofluorene can be regarded as trisubstituted methanes. Octafluorofluorene can be regarded as a disubstituted methane and it is the most deviating point of the CH acid cloud. It is important to note that (C6F5)CH(COOEt)2, although formally a CH acid, is able to form a tautomeric structure, which has a planar central carbon atom and is protonated on one of the carbonyl oxygen atoms of the ester groups, thus being an OH acid. In addition, the proton is strongly chelated by the oxygen atom of the second carbonyl group, resulting in a stable 6-membered cycle. If this tautomeric equilibrium is taken into account in the computation of ΔGdiss by means of a pseudo conformer equilibrium of the tautomers in COSMO-RS, this compound neatly falls into the CH acids group of the regression.
Because of the highly delocalized charge in the anions and thus low sensitivity to moisture and other ions in the solution (and also very suitable spectral properties), we rate the pKa values of CH acids as the most reliable of the three acid groups in the fit data set.
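The direct prediction for CH acids described above amounts to evaluating eq. (1) with the theoretical coefficients c1 = 1 and c2 = −1.28. A minimal Python sketch follows; the temperature of 298.15 K is an assumption (the text does not state one), the function name is hypothetical, and the input value is an arbitrary illustration rather than a computed ΔGdiss.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal mol^-1 K^-1
T_ASSUMED = 298.15     # K; assumption, not stated in the text
C1_IDEAL = 1.0         # theoretical slope factor
C2_IDEAL = -1.28       # -log[CH3CN], value quoted in the text

def pka_ab_initio(dg_diss, temperature=T_ASSUMED):
    """pKa from dG_diss (kcal/mol) using only the theoretical coefficients."""
    rt_ln10 = R_KCAL * temperature * math.log(10)
    return C1_IDEAL * dg_diss / rt_ln10 + C2_IDEAL

# Arbitrary illustrative dissociation free energy in kcal/mol
print(pka_ab_initio(20.0))
```

At the assumed temperature, RT ln(10) is about 1.36 kcal mol−1, so an error of 1.36 kcal mol−1 in ΔGdiss corresponds to an error of one pKa unit in such a prediction.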
The acids dissociating from an oxygen atom (OH acids) have to be considered in greater detail. Most of them are phenols that are heavily substituted by electronegative and electron acceptor substituents (the least substituted one is 2,4,6-tribromophenol). Conjugation of the OH center with the aromatic system provides a possibility for delocalization of the charge, although not nearly as efficiently as in the CH acids group, due to the higher electronegativity of oxygen compared to carbon and due to the fact that just one substituent is attached to the oxygen, compared to three substituents attached to the carbon atom in the CH acids. For some of the phenols the possibility for delocalization of the charge in the anion is diminished even further by the steric hindrance of bulky electron-acceptor groups such as nitro (NO2) or (to a lesser extent) trifluoromethanesulfonyl (Tf), which try to avoid contact, are bent out of the ring plane, and thus fail to conjugate efficiently with the O− (deprotonated OH) center. Therefore, phenols form a distinct second cloud in the figure, lower than the CH acids cloud. There are 8 OH acids in the set that form anions with a localized charge: all 5 sulfonic acids, the two carboxylic acids (benzoic acid and acetic acid), and perfluoro-tert-butyl alcohol. All sulfonic acids included here, as well as benzoic acid, do have an aromatic system. These aromatic systems, however, are not conjugated with the OH acidity centre but are separated from it by an SO2 or a CO fragment. Acetic acid and perfluoro-tert-butyl alcohol do not have an aromatic system. Consequently, all of these 8 acids are distinctly separated from the "delocalized" CH acids in the ΔGdiss versus pKa plot and are also slightly lower than the cloud of substituted phenols (Figure 1).
The acids dissociating from a nitrogen atom (NH acids) are all sulfonimides or carbonylsulfonamides, except for four aromatic amines. All the amides and imides are quite similar to sulfonic acids in that the charges in the anion are rather localized (although somewhat more delocalized than in sulfonic acids). Because of this it is not surprising that these acids form a joint group with sulfonic and carboxylic acids. The four aromatic amines have one or two substituted aromatic rings connected to the NH acidity center. These aromatic amines are a borderline case between the CH acids and the OH acids with localized-charge anions: the charge delocalization is similar to that of phenols. Thus, it is not surprising that in Figure 1 they do not fit visually into the group of "delocalized" CH acids and, just like the phenols, do not fully fall into the group of the strictly "localized" acids such as carboxylic or sulfonic acids. On the basis of the above considerations, in order to avoid too extensive splitting of the data set, and considering that phenols and aromatic amines do not deviate strongly from the rest of the OH and NH acids, we split the data set in two: CH acids giving anions with highly delocalized charges, and all other acids, which have less extensive delocalization of charge in their anions. The assignment of the compounds to these groups, formally called "delocalized" and "localized", is given in the fifth column of Table 1. The regression of the experimental acetonitrile acid pKa values with the calculated values of ΔGdiss was repeated independently for the two compound families.
There are 38 compounds in the fit data set that allow for delocalization of the charge over the molecular structure in their anionic form, all of which are CH acids (with the exception of compound (C6F5)CH(COOEt)2, as explained above). The pKa versus ΔGdiss regression of this compound family results in a correlation coefficient of r2 = 0.971. The regression equation for the pKa of acids forming charge-delocalized anions in solvent acetonitrile reads

$$\mathrm{p}K_a^{\mathrm{delocalized}} = 0.91(\pm 0.01)\,\frac{\Delta G_{\mathrm{diss}}}{RT\ln(10)} - 0.1(\pm 0.1) \qquad (5)$$

The calculated axis intercept of −0.1 is in reasonable concordance with the theoretical value of c2,ideal = −log[CH3CN] = −1.28. If we had omitted the free energy difference of CH3CNH+ and CH3CN, which we calculate as −253.48 kcal mol−1, from the definition of ΔGdiss, we would have obtained a regression constant of ĉ2 = 169.9. Application of eq. (5) to predict the pKa of the family of "delocalized anion" compounds in the fit set yields an rmse of 0.91 pKa units. Only two acids, octafluorofluorene and 9-C6F5-octafluorofluorene, deviate by more than 2 pKa units, and these are the acids that happen to have the strongest charge localization in their anions. If these two acids are excluded from the regression we arrive at rmse = 0.74 pKa units, which is excellent, keeping in mind that the pKa values are not adjusted in any way. Excluding these two compounds, the regression results in coefficients c1 = 0.94 and c2 = −0.53 with r2 = 0.981. Both c coefficients are now in better agreement with their theoretical values.
Table 1 footnotes: calculated pKa values are listed both as obtained from eq. (4) and, as pKaCalc(corr), as obtained from eq. (7). b Tf denotes CF3-SO2-; Tos denotes 4-Me-C6H4-SO2-. c Formal notation, see text. d Tautomeric equilibrium, see text.
In the remaining 55 compounds of the fit data set the anionic charge cannot be delocalized over the anion's structure. The pKa versus ΔGdiss regression of this compound family results in a correlation coefficient of r2 = 0.958. The regression equation for the pKa of acids forming charge-localized anions in solvent acetonitrile reads

$$\mathrm{p}K_a^{\mathrm{localized}} = 1.08(\pm 0.01)\,\frac{\Delta G_{\mathrm{diss}}}{RT\ln(10)} - 7.8(\pm 0.1) \qquad (6)$$

Considering the typical accuracy of the underlying DFT method, the calculated axis intercept of −7.8 is still in reasonable concordance with the theoretical value. If we had omitted the free energy difference of CH3CNH+ and CH3CN, which we calculate as −253.48 kcal mol−1, from the definition of ΔGdiss, we would have obtained a regression constant of ĉ2 = 194.2. Application of eq. (6) to predict the pKa values of the family of "localized anion" compounds in the fit set yields an rmse of 1.38 pKa units. Six compounds deviate by more than 2 pKa units from the regression line. Four of them are phenols with a large number of strongly electronegative substituents. As discussed above, the anionic charge of these compounds must be considered as partly delocalized, and thus they show a systematic deviation from the regression line. The remaining outliers are acetic acid and (CF3)3COH. Excluding these six compounds, the regression results in coefficients c1 = 1.10 and c2 = −8.91 with r2 = 0.976 and we arrive at rmse = 0.99 pKa units.
These results are at least in part the reason why no splitting of the regression was observed in the earlier work [10] on acid pKa values in water and DMSO: the dataset of ref. 10 did not include CH acids. The second reason may be that water is capable of solvating anions with very high efficiency, so that some effects that are visible in acetonitrile can be masked in water. This effect is also present, although less pronounced, in dimethylsulfoxide, which also has considerably stronger anion-solvating abilities than acetonitrile.
A major goal of this report is to provide a simple and practical prediction methodology for pKa values of acids in acetonitrile. The systematic deviations observed above lead us to the conclusion that a simple heuristic correction to ΔGdiss, which accounts for the different behaviour of acids giving anions with different levels of charge delocalization, should lead to an improved correlation as well as to a simple and practical LFER method of the form of eq. (1). From the separate regressions of "delocalized" and "localized" anion acids in eqs. (5) and (6), it can be concluded that the significant difference is the axis intercept of the regression, not the slope. Thus, the addition of a simple shift value will be sufficient. If ΔGdiss for compounds with localized anions is corrected by a value of −7.5 kcal mol−1, while the "delocalized" anion compounds remain untouched, the linear regression of the experimental acetonitrile pKa with the corrected calculated values of ΔGdiss results in a correlation coefficient of r2 = 0.957. The regression equation with the thus corrected ΔGdiss reads:

$$\mathrm{p}K_a = 0.92(\pm 0.01)\,\frac{\Delta G_{\mathrm{diss}}}{RT\ln(10)} - 0.1(\pm 0.1) \qquad (7)$$

Application of eq. (7) to predict the pKa of the complete fit data set yields an rmse of 1.38 pKa units. The calculated results are listed in the ninth column of Table 1. The strong outliers of the prediction with eq. (7) are the same as for the separate fits above. If they are removed from the fit set, the regression results in coefficients c1 = 0.95 and c2 = −0.53 with r2 = 0.970 and we arrive at rmse = 1.13 pKa units. It is interesting to note that many of the "borderline" compounds (with respect to charge delocalization in anions) give a better fit with eq. (4) than with eq. (7).
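The unified prediction can be summarized in a short Python sketch: the −7.5 kcal mol−1 shift is applied to ΔGdiss for acids assigned to the "localized" group, and the corrected value is inserted into the LFER with the coefficients 0.92 and −0.1. The sketch assumes that the shift is simply added to ΔGdiss and that T = 298.15 K, neither of which is spelled out in the text; the function name and the input value are hypothetical.

```python
import math

R_KCAL = 1.987204e-3      # gas constant in kcal mol^-1 K^-1
T_ASSUMED = 298.15        # K; assumption, not stated in the text
SHIFT_LOCALIZED = -7.5    # kcal/mol, correction for "localized" anion acids
C1_CORR = 0.92            # slope coefficient of the corrected fit
C2_CORR = -0.1            # axis intercept of the corrected fit

def pka_corrected(dg_diss, localized, temperature=T_ASSUMED):
    """Predict pKa from dG_diss (kcal/mol) with the localized-anion shift."""
    # Assumption: the shift is added to dG_diss before applying the LFER
    dg = dg_diss + (SHIFT_LOCALIZED if localized else 0.0)
    return C1_CORR * dg / (R_KCAL * temperature * math.log(10)) + C2_CORR

# Same hypothetical dG_diss evaluated under both group assignments
print(pka_corrected(20.0, localized=False))
print(pka_corrected(20.0, localized=True))
```

The only input beyond ΔGdiss is the group assignment, so the quality of a prediction depends on judging correctly whether the anion charge is localized or delocalized.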

Test Data Set
To obtain an independent test of the pKa prediction methods developed here, literature data for the acidities of 129 compounds in acetonitrile were collected. The bulk of the test set data was taken from the review book of Izutsu [6]. Of the 102 acid solutes in Izutsu's collection, 100 were used in the test data set. Two compounds from Izutsu's set (acetic acid and benzoic acid) had already been used in the fit data set and were thus excluded from the test set. The remaining 29 test data pKa values were taken from various other sources. The test set comprises diverse, complex multifunctional compounds and thus should be a challenging trial for the predictive qualities of the methodology developed in the previous section. Formal assignment of the acids as "delocalized" and "localized" was done on the basis of the considerations outlined above. A special case is formed by dicarboxylic acids that form stable intramolecular hydrogen bonds in their monoanions. In these anions, there is efficient delocalization of charge across the cyclic structure formed, and these acids were assigned to the group of acids with charge-delocalized anions. In addition, thiophenols, perchloric acid, and fluorosulfuric acid were also assigned to the same group, based on the analysis of the COSMO surfaces of the anions. The prediction results for all 129 acids in the test data set are listed in Table 2 and depicted in Figure 2. pKa predictions using the LFER parameters of the uncorrected (raw) fit of eq. (4) are given in column 8 of Table 2. The test data set is predicted with a mean signed error of −1.32 pKa units and an rmse of 3.63 pKa units. If the ΔGdiss values of acids giving charge-localized anions are corrected by −7.5 kcal mol−1 and the corresponding LFER parameters of the "corrected" fit of eq. (7) are used to predict the pKa of the test data set, the mean signed error reduces to −0.04 pKa units and the rmse reduces to 2.10 pKa units. Taking into account the uncertainties of the experimental data and the diversity of the data sources, this prediction quality is satisfactory.
Table 2 footnotes: calculated pKa values are listed both as obtained from eq. (4) and, as pKaCalc(corr), as obtained from eq. (7).
Figure 2. Test data set. Calculated vs. experimental acid pKa in solvent acetonitrile. Open circles: pKa calculated by eq. (4). Filled circles: pKa calculated by eq. (7).
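The two error measures quoted for the test set, the mean signed error and the rmse, can be computed as in the following Python sketch. The sign convention (predicted minus experimental) is an assumption, since the text does not define it, and the numbers in the example are hypothetical rather than entries of Table 2.

```python
import math

def mean_signed_error(predicted, experimental):
    """Mean of (predicted - experimental); the sign convention is assumed."""
    errors = [p - e for p, e in zip(predicted, experimental)]
    return sum(errors) / len(errors)

def rmse(predicted, experimental):
    """Root mean square error between predicted and experimental values."""
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical pKa values for three compounds (illustration only)
pka_pred = [10.0, 15.5, 21.0]
pka_exp = [11.0, 15.0, 23.0]
print(mean_signed_error(pka_pred, pka_exp))
print(rmse(pka_pred, pka_exp))
```

A mean signed error close to zero combined with a sizeable rmse, as reported for the corrected model, indicates scatter without a systematic offset.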

Conclusions
A method for the quantum chemical prediction of the acidity of organic and inorganic acids in solvent acetonitrile has been presented. Acetonitrile pKa values of acids were predicted via a thermodynamic cycle, utilizing Gibbs free energies of dissociation in acetonitrile solution as computed by the COSMO-RS theory on the basis of quantum chemical DFT/COSMO calculations. Without any special adjustments of radii or other parameters this led to a prediction model for acid pKa values in acetonitrile. In contrast to our findings on aqueous acidity predictions [10], the slope of the experimental pKa versus theoretical ΔGdiss was found to match the theoretical value 1/RT ln(10) well. No unique linear free energy relationship between the calculated Gibbs free energy and the experimental acid pKa values was found. Instead, the linear free energy relationship splits into two major acid groups. The affiliation of acids to these families is based on the degree of localization of charge in the anion produced on acid dissociation. For acids with strongly delocalized charges in the anions, both slope and axis intercept of the linear free energy relationship are very close to their theoretical values, thus allowing for direct ab initio prediction without an intermediate LFER correlation. The rmse of the acids with strongly delocalized charges in the anion, predicted with the theoretical values for both slope and axis intercept of the linear free energy relationship, is 1.12 pKa units, compared to 0.91 pKa units achieved by fitting the LFER parameters. For acids with weakly delocalized or localized charges in the anion, the slope of the linear free energy relationship is also very close to its theoretical value, but the axis intercept differs by about −7.5 kcal mol−1. For these compounds, a LFER correlation based prediction of good quality is possible.
From the given considerations, it is possible to unify the prediction for both families of compounds into one practical prediction methodology, which applies a correction term for the free energy of dissociation.
The prediction of the pKa of acids and bases in solvents water [10,24] and dimethylsulfoxide [10] differs from the current findings in solvent acetonitrile in two aspects: first, no partitioning into groups was observed, and second, the slope of the pKa versus ΔGdiss regression was significantly lower than the theoretical value. The first difference is at least in part caused by the absence of CH acids in the dataset of ref. 10. An additional reason is that water is capable of solvating anions with very high efficiency, so that some effects that are visible in acetonitrile can be masked in water. The same, although in a less pronounced way, holds for solvent dimethylsulfoxide, which also has considerably stronger anion-solvating abilities than acetonitrile. This suggests that both findings are, at least in part, related to the capability of the solvent to solvate and thus stabilize the anions, which is not captured sufficiently by the quantum mechanical method used. This is further corroborated by recent reports claiming that the addition of explicit solvent molecules to the continuum solvation model calculations of aqueous pKa results in a slope of the pKa vs. ΔGdiss regression that is very close to the expected theoretical slope [31]. The results of this work also demonstrate that there is an urgent and as yet unmet need for reliable experimental data on physicochemical parameters in order to develop and validate computational approaches for their prediction.