Uncertainties associated with theoretically calculated N2-broadened half-widths of H2O lines

With different choices of the cut-offs used in theoretical calculations, we have carried out extensive numerical calculations of the N2-broadened Lorentzian half-widths of the H2O lines using the modified Robert–Bonamy formalism. Based on these results, we are able to thoroughly check for convergence. We find that, with the low-order cut-offs commonly used in the literature, one is able to obtain converged values only for lines with large half-widths. Conversely, for lines with small half-widths, much higher cut-offs are necessary to guarantee convergence. We also analyse the uncertainties associated with calculated half-widths, and find that they are correlated with the half-width values in the same way. In general, the smaller the half-widths, the poorer the convergence and the larger the uncertainty associated with them. For convenience, one can divide all H2O lines into three categories (1: large, 2: intermediate, and 3: small) according to their half-width values. One can use this division to judge whether the calculated half-widths are converged or not, based on the cut-offs used, and also to estimate how large their uncertainties are. We conclude that with the current Robert–Bonamy formalism, for lines in category 1 one can achieve the accuracy requirement set by HITRAN, whereas for lines in category 3, it is impossible to meet this goal.


Introduction
The modelling of the atmosphere from satellite-based, balloon-based, and Earth-based instruments requires an accurate spectroscopic database such as HITRAN [1,2]. This widely used database has spectral line data for the most important molecules in bands from the microwave to the ultraviolet spectral regions. Some of the data for the water vapor lines are experimental, but because of the large number of lines and the ambient temperature range in different parts of the atmosphere, many have to be calculated theoretically. The accuracy of these calculated values depends on many factors, such as the line-shape theory used, the sophistication and accuracy of the intermolecular potential model, and the need to obtain converged results. For instance, for accurate atmospheric applications involving the very important water vapor molecule, it is desirable to know the half-widths and their temperature dependence (T dependence) to better than 3% uncertainty for strong lines and 10% for weaker lines [2]. Unfortunately, of all the spectroscopic parameters in HITRAN for H2O, the pressure-broadened self- and air-broadened half-widths and their T dependences contain the largest uncertainties. Recently, in order to improve this situation by combining all available data, both experimental measurements and theoretical calculations, the H2O database in HITRAN was updated [2]. In developing this updated version in 2006, in cases where no half-width values of the desired accuracy were available in the vast experimental literature, theoretically calculated values were adopted. The latter were derived by Gamache and collaborators [3–6] from the Complex Robert-Bonamy (CRB) line-shape theory and potential models consisting of a short-range site-site interaction and long-range interactions. This new database represents our cutting-edge knowledge of these parameters. Since then, attempts to refine the theoretical calculations have continued [7].
The Robert-Bonamy (RB) formalism was developed more than three decades ago [8]. In comparison with previous formalisms, the main features of this theory are the non-perturbative treatment of the S scattering matrix through the use of the linked cluster theorem, and a convenient description of classical trajectories for large impact parameters as well as for the closest approach. However, the RB formalism contains several basic assumptions whose applicability was not thoroughly verified. Since then, the formalism has been widely used in calculating the Lorentzian half-widths and shifts of spectral lines for many molecular systems. Some progress has been made recently to improve the original version, but most of these efforts have been related to the extension to more complicated potentials, vibrational dependences, more accurate trajectories, etc. None has focused on making further improvements to its core part. As a result, calculations using the RB formalism have become routine. In cases where the calculated values did not match the experimental data well, the strategy commonly used for improvements was to tune the parameters, rather than to scrutinize the RB formalism itself. Given that the current RB formalism remains almost the same as 30 years ago, one can ask whether this theory can meet the cutting-edge requirement of HITRAN. Furthermore, in order to make the calculations tractable with potentials including short-range site-site models, one usually introduces two cut-offs to limit the number of terms considered, although until now no thorough convergence checks have been made. Thus, it has become necessary to scrutinize current theoretical calculations to address this problem.
In Gamache's research work, both the potential models used and their final results are available [7]. Thus, it is possible for others to make independent verifications. In the present study, we repeat and extend their calculations beyond their choices for cut-offs. Based on these calculations, we find that one can conveniently divide the H2O lines into three categories according to the values of the half-widths. It turns out that this division coincides not only with a division of lines according to their convergence behaviour, but also with a division according to the uncertainties associated with the calculated values. In general, the smaller the half-widths, the poorer the convergence and the larger the uncertainty associated with them. For lines with large half-width values, by using a realistic potential model one is able to use the RB formalism and meet the accuracy requirement set by HITRAN, but for lines with small half-widths, the RB formalism definitely does not meet this requirement.
2. General formalism of the line half-width

2.1. Brief outline of the RB formalism and convergence challenges
Within the modified RB formalism, in which a subtle error in deriving the original RB formalism has been corrected [9], the expression for the half-widths is given by Equation (1), where $S_1$ and $S_2$ are the first- and second-order terms of the perturbation expansion of the Liouville operator $\hat{S}$ ($= S_I \cdot S_F^{*}$, where $S_I$ and $S_F$ are scattering matrices in Hilbert space). In that expression, $f(v)$ is the Maxwell-Boltzmann distribution function, $\bar{v}$ is the average velocity, $r_c$ is the closest distance of approach along the trajectories, $r_{c,\min}$ is the minimum of $r_c$, corresponding to strictly head-on collisions, and the 'apparent' velocity $v'_c$ is defined by Equation (2) [8] in terms of $\sigma$ and $\varepsilon$, the two parameters of the LJ model used in modelling the isotropic interaction potential. Usually, $S_2$ is represented by three terms labelled $S_{2,\text{outer},i}$, $S_{2,\text{outer},f}$, and $S_{2,\text{middle}}$, respectively. In the present study, we follow the same custom and, as an example, outline the main features in calculating $S_{2,\text{outer},i}$. In the original expression for $S_{2,\text{outer},i}$, Equation (3), the quantum numbers $j_i$ and $\tau_i$ specify the energy levels of the initial states of the H2O line of interest, and the interaction operator $\hat{V}(\mathbf{R}(t))$ in Hilbert space is defined by

$$\hat{V}(\mathbf{R}(t)) \equiv e^{i(H_a + H_b)t/\hbar}\, V(\mathbf{R}(t))\, e^{-i(H_a + H_b)t/\hbar}.$$

Usually, in order to evaluate the potential matrix elements in Equation (3), one prefers to express $V(\mathbf{R}(t))$ in terms of the standard spherical expansion, Equation (5). In the present study, we adopt a potential model consisting of the long-range dipole-quadrupole and quadrupole-quadrupole interactions $V_{dq}(\Omega_1, \Omega_2, \mathbf{R}) + V_{qq}(\Omega_1, \Omega_2, \mathbf{R})$ and a short-range interaction $V_{\text{atom-atom}}(\Omega_1, \Omega_2, \mathbf{R})$ modelled by the site-site model. For $V_{dq}$ and $V_{qq}$, the spherical expansions are well known and contain very few terms. For $V_{dq}$, the expansion contains only one term (i.e. $L_1 = 1$, $K_1 = 0$, $L_2 = 2$, and $L = 3$), which varies with $R$ as $R^{-4}$. For $V_{qq}$, it has three terms (i.e. $L_1 = 2$, $K_1 = 0, \pm 2$, $L_2 = 2$, and $L = 4$), which vary as $R^{-5}$.
With respect to $V_{\text{atom-atom}}$, the site-site model expression is given by Equation (6), where $\sigma_{ij}$ and $\varepsilon_{ij}$ are parameters and $r_{ij}$ are the distances between the $i$th atom of the absorber molecule $a$ (i.e. H2O) and the $j$th atom of the bath molecule $b$ (i.e. N2). In order to evaluate its matrix elements, one needs to rewrite $V_{\text{atom-atom}}(\Omega_1, \Omega_2, \mathbf{R}(t))$ as a standard spherical expansion, Equation (7), where $n_{\{ij\}}$ runs over all pairs of atoms in Equation (6), $q = 6$ or 12, $w$ is an integer index from 0 to infinity, and the definitions of $U(L_1 K_1 L_2 L, n_{\{ij\}}, wq)$ can be found in the literature. Apart from the requirements that $L_1 + L_2 + L$ and $L_2$ be even, and the well-known requirements from the Clebsch-Gordan coefficients and the rotational Wigner matrices, no other restrictions are enforced. Although the choices of $q$ and $n_{\{ij\}}$ are limited, the number of combinations of $q$ and $n_{\{ij\}}$ with all possible values of $w$ is infinite. Therefore, one must introduce cut-offs. There are two kinds of cut-offs associated with the summation indices in Equation (7). The first limits the set of irreducible tensor ranks needed in the expansion: the rank $L_1$ with its subsidiary index $K_1$, and the ranks $L_2$ and $L$. For example, one can choose 2 as the upper limit for both $L_1$ and $L_2$. By enforcing these restrictions on $L_1$ and $L_2$, the selections for the two remaining indices $L$ and $K_1$ are also limited. The second kind of cut-off sets an upper limit for the index $w$, so that the possible combinations of $q$, $n_{\{ij\}}$, and $w$ become finite. In the literature, if one chooses 8 as the maximum of $2w$, the cut-off is said to be eighth order. Usually, the selected cut-offs are justified by estimating how small the correction terms in the spherical expansion of $V_{\text{atom-atom}}$ are in comparison with the leading terms. Thus, it appears that the expansion problem can easily be solved.
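For orientation, a site-site model whose expansion involves only the powers q = 6 and 12 is conventionally written as a sum of 12-6 Lennard-Jones terms over all atom pairs. The form below is the textbook expression, given here as an assumption for illustration; it is not recovered from the equation in the text, and the summation ranges (atoms of molecule a and of molecule b) are likewise assumed.

```latex
V_{\text{atom-atom}}
  = \sum_{i \in a} \sum_{j \in b}
    4\varepsilon_{ij}
    \left[
      \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12}
      - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6}
    \right]
```

Each $r_{ij}$ depends on both molecular orientations and on the separation $R$, which is why rewriting this sum as a spherical expansion generates an infinite series in the index $w$.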
Unfortunately, these steps do not make the problem go away, because in evaluating $S_{2,\text{outer},i}$ one needs to consider products of two potential operators. For later convenience, we add the subscript tot to $V$, writing $V_{\text{tot}}$, to emphasize that it is the total potential (i.e. $V_{dq} + V_{qq} + V_{\text{atom-atom}}$). Indeed, consideration of the products of two potentials has a large impact on deciding whether the cut-offs are adequate. The obvious impact is that the total number of summation terms for the product of two $V_{\text{tot}}$ is the square of that for an individual $V_{\text{tot}}$. We know that the number of terms of $V_{\text{tot}}$ is mainly determined by the number of terms of $V_{\text{atom-atom}}$. Thus, if the number of terms for $V_{\text{atom-atom}}$ is large, the total number for the product of two $V_{\text{tot}}$ is much larger.
The second impact is more subtle than the first and results only from the second kind of cut-off. Due to different symmetries, there are no couplings between two potential terms whose irreducible tensor ranks are not identical. As a result, it is certain that weak terms ignored in evaluating $V_{\text{atom-atom}}$ by the first cut-off can only make small contributions to $S_{2,\text{outer},i}$. Then, a criterion established by checking the convergence of $V_{\text{atom-atom}}$ for this cut-off can be, more or less, applicable for $S_{2,\text{outer},i}$. In contrast, with respect to the second kind of cut-off, there are couplings between terms ignored by the cut-off in evaluating $V_{\text{atom-atom}}$ and terms retained with the same tensor ranks in $V_{\text{tot}}$. We would like to emphasize here that the latter consist not only of the terms remaining in the spherical expansion of $V_{\text{atom-atom}}$, but also of the components labelled by the same tensor ranks in the expansions of $V_{dq}$ and $V_{qq}$. As a result, ignoring weak terms in $V_{\text{atom-atom}}$ through the second kind of cut-off could cause significant errors, because these terms could make significant contributions to $S_{2,\text{outer},i}$ through couplings with the remaining strong components. Thus, one can conclude that it becomes more difficult to establish a reliable convergence criterion, because the smallness of the terms in $V_{\text{atom-atom}}$ ignored by the second kind of cut-off is not by itself adequate to justify the convergence of $S_{2,\text{outer},i}$.
Above, we have outlined the usual method used to calculate $S_{2,\text{outer},i}$. A drawback of this method is that one could encounter convergence problems when site-site potential models are considered. In these cases, adopting higher cut-offs in the spherical expansion and considering all contributions, including couplings between terms with the same categories of tensor ranks whose origins could be of any type (e.g., long-range electrostatic, long-range induction, short-range site-site interactions, etc.), requires much tedious work and numerical evaluations of a large number of resonance functions, so that one may not be able to obtain converged results.
In summary, the convergence over these two cut-offs could become a formidable obstacle in practical calculations of $\mathrm{Re}(S_2)$ for molecular pairs, except for large separation distances where the electrostatic interaction is overwhelmingly dominant. Observant readers may have noticed that, at this stage, we have avoided the use of the phrase 'convergence of calculated half-widths'. Although half-widths are determined by calculating $\mathrm{Re}(S_2)$, the convergence of $\mathrm{Re}(S_2)$ and the convergence of half-widths are not always the same. We will return to this subject later. In conclusion, most previous theoretical calculations have been carried out using low-order cut-offs and their accuracy could be limited by convergence problems; a better way to deal with this problem is therefore needed.

Coordinate representation and a new formalism
The possibility of convergence errors forces one to question if this method is the best way to proceed. Based on our experience in dealing with complicated potential models in treating far-wing line shapes and other problems [10,11], we know that the coordinate representation used in those studies has advantages when dealing with complicated potentials.
With the standard method, the basis set in Hilbert space is constructed from $|j_i \tau_i m_i\rangle \otimes |i_2 m_2\rangle$, the product of the internal states of H2O and N2. Instead of this choice, one can select the orientations of the pair of molecules as the basis set in Hilbert space, i.e. $|\delta(\Omega_1 - \Omega_{1\alpha})\rangle \otimes |\delta(\Omega_2 - \Omega_{2\beta})\rangle$, where $\Omega_{1\alpha}$ and $\Omega_{2\beta}$ denote specified orientations of H2O and N2, respectively. By introducing the coordinate representation, the potential becomes a diagonal operator and the matrix elements become multi-dimensional integrations [10,11]. In the standard method, the functions required to be evaluated are the resonance functions, and their total number depends on how many combinations of the summation indices $L_1$, $K_1$, $L_2$, $L$ and $q$, $n_{\{ij\}}$, $w$ are used in Equation (7) under the restrictions enforced by the two kinds of cut-offs. As these two cut-offs increase, the number of resonance functions increases very quickly and evaluating all of them becomes a formidable task.
On the other hand, with the new method, the functions required to be evaluated are the correlation functions defined later, and their total number is determined only by the number of choices of the tensor ranks $L_1$, $K_1$, $L_2$, and $L$ restricted by the first kind of cut-off. This means that no matter how high the second kind of cut-off is, the number of correlation functions remains unchanged. In addition, because $V_{\text{atom-atom}}$ becomes an ordinary function, one can choose a cut-off of any order one wants. In fact, the second kind of cut-off affects only how accurately one evaluates the coefficients $u(L_1 L_2 L; K_1; R(t))$ in Equation (5). No matter how high this cut-off goes in calculating these coefficients, it makes only a small difference to the required computational resources. As a result, we can use values of both cut-offs that are sufficiently high to guarantee convergence in practical calculations.
We have developed a new formalism to calculate half-widths for a simpler system consisting of two linear molecules (i.e. the N2-N2 pair) [12]. In the present study, we make an extension to the H2O-N2 pair. In this case, $S_{2,\text{outer},i}$ can be expressed as Equation (8), in which a shorthand notation is used for the basis set of the molecular pair in the coordinate representation and the subscript of $V$ indicates that the potential is evaluated at the specified orientations of H2O and N2. The inner products between the basis states of the two representations describe the transformation between them, and are nothing but products of the complex conjugates of the H2O wave functions at the specified H2O orientation and the complex conjugates of the N2 wave functions at the specified N2 orientation.
In order to simplify Equation (8), we introduce the correlation function $F_{L_1 K_1 K'_1 L_2}(t)$, which is defined as a convolution integration, Equation (9), in which the function $G_{L_1 K_1 K'_1 L_2}(t, t')$ is given by Equation (10). In Equation (10), the factor $(-1)^{L_1 + L_2 + L} = 1$ because the summation index $L$ must satisfy $L_1 + L_2 + L = \text{even}$, and $\Theta_{tt'}$ is the angle between the two vectors $\mathbf{R}(t)$ and $\mathbf{R}(t')$. The set used to label the correlation functions consists of one tensor rank $L_1$ with two subsidiary indices $K_1$, $K'_1$ related to H2O and another tensor rank $L_2$ for N2. Because N2 is a homonuclear diatomic molecule, $L_2$ must be even. Thus, the number of sets is determined by the upper limits of $L_1$ and $L_2$. If one chooses the II$^R$ representation to develop the H2O wave functions, in which the two H atoms are symmetrically located in the molecule-fixed frame, the values of $K_1$ and $K'_1$ must also be even. Then, it is easy to show that if one sets 2 as the maximum for both $L_1$ and $L_2$, the number of sets is 20. If one increases the maximum for $L_1$ from 2 to 3 or 4, the number of sets increases from 20 to 38 or 88, respectively. Finally, by setting 4 as the maximum for both $L_1$ and $L_2$, the number becomes 132. We do not present the assignments of $L_1$, $K_1$, $K'_1$, and $L_2$ for the correlation functions here. Interested readers can contact the corresponding author by email to obtain the detailed assignments.
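The counts quoted above (20, 38, 88, and 132) can be reproduced by a direct enumeration. The sketch below assumes, in addition to the stated rules (even K1, K1' and even L2), that L1 starts at 1 and that L2 = 0 is included; these two choices are an inference that happens to reproduce the quoted numbers and are not the authors' published assignments. The function name is hypothetical.

```python
def correlation_labels(l1_max, l2_max):
    """Enumerate (L1, K1, K1', L2) labels under the assumed selection rules."""
    labels = []
    for l1 in range(1, l1_max + 1):            # assumption: L1 runs from 1
        # even subsidiary indices with |K| <= L1
        ks = [k for k in range(-l1, l1 + 1) if k % 2 == 0]
        for l2 in range(0, l2_max + 1, 2):     # even L2; L2 = 0 included (assumption)
            for k1 in ks:
                for k1_prime in ks:
                    labels.append((l1, k1, k1_prime, l2))
    return labels


# Counts for the four cut-off choices discussed in the text: 20, 38, 88, 132
for l1_max, l2_max in [(2, 2), (3, 2), (4, 2), (4, 4)]:
    print(l1_max, l2_max, len(correlation_labels(l1_max, l2_max)))
```

The enumeration makes the key property visible: the number of labels depends only on the upper limits of L1 and L2, and not on the order of the second kind of cut-off.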
As shown by Equations (9) and (10), the correlation functions $F_{L_1 K_1 K'_1 L_2}(t)$ are convolution integrations over $t'$ from $-\infty$ to $\infty$ whose integrands are proportional to the products of $u(L_1 L_2 L; K_1; R(t' + t/2))$ and $u(L_1 L_2 L; K'_1; R(t' - t/2))$. The latter are the products of the radial expansion components of the potentials $V(\mathbf{R}(t' + t/2))$ and $V(\mathbf{R}(t' - t/2))$ appearing in Equation (8). In addition, the integrands of the convolutions also depend on the angle between the two distance vectors $\mathbf{R}(t' + t/2)$ and $\mathbf{R}(t' - t/2)$, which are the arguments of the two potentials. In order to determine the physical meaning of the correlation functions, let us consider how the convolution integration over $t'$ given by Equation (9) is carried out. One can imagine that as the integration variable $t'$ varies from $-\infty$ to $+\infty$, both $\mathbf{R}(t' + t/2)$ and $\mathbf{R}(t' - t/2)$ move along the same trajectory. One distance vector follows the other along the trajectory, always with exactly the same delay time. The delay time is nothing but the time displacement $t$, the argument of the correlation functions. We note that the products of $V(\mathbf{R}(t' + t/2))$ and $V(\mathbf{R}(t' - t/2))$ represent their overlaps. It is obvious that as the delay time $t$ increases, the separation between the two potentials increases and their overlap decreases. Therefore, the correlation functions represent how the total overlap between the corresponding expansion components of $V(\mathbf{R}(t' + t/2))$ and $V(\mathbf{R}(t' - t/2))$, accumulated over a whole trajectory, changes with the time displacement $t$.
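The decay of the overlap with increasing delay time can be illustrated numerically. The following is a simplified sketch, not the authors' correlation function: it keeps a single hypothetical radial component u(R) proportional to R^-n, uses a straight-path trajectory with a constant speed, and omits the factor that depends on the angle between the two distance vectors. The numerical values of r_c and v are illustrative only.

```python
import math


def radial_component(r, n=4):
    """Hypothetical radial factor u(R) ~ R**-n (n = 4 mimics an R^-4 term)."""
    return r ** (-n)


def overlap(t, r_c=4.0, v=7.5, n=4, t_max=5.0, steps=4000):
    """Integrate u(R(t' + t/2)) * u(R(t' - t/2)) over t' by a Riemann sum.

    Units: r_c in angstrom, v in angstrom/ps, t and t' in ps.
    R(t) = sqrt(r_c**2 + (v*t)**2) is a straight-path modulus; the angular
    factor of the full correlation function is deliberately left out.
    """
    dt = 2.0 * t_max / steps
    total = 0.0
    for k in range(steps + 1):
        tp = -t_max + k * dt
        r_plus = math.sqrt(r_c ** 2 + (v * (tp + t / 2.0)) ** 2)
        r_minus = math.sqrt(r_c ** 2 + (v * (tp - t / 2.0)) ** 2)
        total += radial_component(r_plus, n) * radial_component(r_minus, n) * dt
    return total


for delay in (0.0, 0.2, 0.5):
    print(delay, overlap(delay))
```

The accumulated overlap is largest at zero delay and falls as the delay grows, which is the qualitative behaviour described above.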
Then, in terms of the correlation functions, one is able to rewrite Equation (3) in a more compact form. In deriving that expression, two functions that are independent of the potential are introduced: $W^{(a)}$, which is expressed through an auxiliary quantity $D$, and $W^{(b)}_{L_2}(t)$. Finally, by introducing the Fourier transforms $H_{L_1 K_1 K'_1 L_2}(\omega)$ of the correlation functions, one is able to obtain an expression for the real part of $S_{2,\text{outer},i}$. Note that $F_{L_1 K_1 K'_1 L_2}(t)$, $H_{L_1 K_1 K'_1 L_2}(\omega)$, and $\mathrm{Re}(S_2)$ are associated with a specified trajectory. As a result, they all depend on $r_c$. For simplicity, we have omitted the argument $r_c$ from the expressions.

Trajectories
Within the RB formalism, the trajectory model and the potential model are the two main factors governing the calculated results. We now briefly outline some trajectory models. When two molecules collide, both their isotropic and anisotropic interactions play a role in determining the trajectory along which their translational coordinates vary with time. However, considering the effects from the anisotropic interactions is a formidable problem, and has not been treated except for the work of Green [13] based on close-coupling scattering theory applicable only for simple systems usually consisting of one atom and one molecule. Therefore, we emphasize that there is a basic assumption that the trajectory is governed by the isotropic potential only. Of course, this is not true and this limitation is a large drawback suffered by most theoretical width and shift calculations.
Since Robert and Bonamy developed the RB formalism, there have been two trajectory models [8] commonly used in calculations. We call them the 'parabolic' and the 'modified parabolic' trajectories here. The first assumes that the trajectory labelled by $r_c$ is a straight path, with the modulus $R(t)$ governed by the 'apparent' velocity $v'_c$ defined by Equation (2), and with $t = 0$ set at the instant when the translational motion reaches the closest distance $r_c$ along the trajectory. This implies that the translational motion proceeds with an 'apparent' velocity. The second is a modified version in which the straight line is replaced by a curve whose definition involves the sine of a time-dependent angle and the velocity $v_c$ at $R = r_c$. However, it seems that one cannot make this replacement everywhere, because this sine could become larger than 1 as $t$ increases for trajectories for which $v_c > v'_c$. This implies that one has to manipulate it further, more or less arbitrarily. Different researchers use different manipulation techniques and usually they do not provide detailed instructions about their method. Therefore, in order to avoid arbitrary effects resulting from adopting the 'modified parabolic' trajectory model, we only consider the original 'parabolic' model in the present study.
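The difficulty with the 'modified parabolic' model can be made concrete with a short numerical sketch. The expressions used below are the commonly quoted RB forms for an LJ 12-6 isotropic potential and are assumptions here, since the equations are not reproduced in this text: v_c from energy conservation at R = r_c, v'_c with the additional radial-force term, R(t) = sqrt(r_c^2 + v'_c^2 t^2), and sin(psi) = v_c t / R(t). The collision speed is taken as the mean relative speed, so that only epsilon/k_B and T enter the velocity ratios; the value of V_MEAN is a rough illustrative number.

```python
import math

SIGMA = 3.83         # angstrom; isotropic LJ parameter quoted in this paper
EPS_OVER_KB = 53.4   # K; isotropic LJ parameter quoted in this paper
TEMP = 296.0         # K
V_MEAN = 7.5         # angstrom/ps; rough mean relative speed, illustrative only


def velocity_ratios(r_c):
    """Return (v_c / v, v_c' / v) for a closest approach r_c in angstrom."""
    # a = 8*eps/(m*v**2), using m*v**2/2 = (4/pi)*k_B*T for the mean speed
    a = math.pi * EPS_OVER_KB / TEMP
    x6 = (SIGMA / r_c) ** 6
    x12 = x6 * x6
    vc2 = 1.0 - a * (x12 - x6)               # energy conservation at R = r_c
    vcp2 = 1.0 + a * (5.0 * x12 - 2.0 * x6)  # includes the radial-force term
    if vc2 < 0.0:
        raise ValueError("r_c is smaller than the head-on limit r_c,min")
    return math.sqrt(vc2), math.sqrt(vcp2)


def parabolic_modulus(t, r_c):
    """R(t) of the 'parabolic' model, with t in ps and t = 0 at closest approach."""
    _, vcp = velocity_ratios(r_c)
    return math.sqrt(r_c ** 2 + (vcp * V_MEAN * t) ** 2)


def sin_psi(t, r_c):
    """Assumed 'modified parabolic' form sin(psi) = v_c * t / R(t)."""
    vc, _ = velocity_ratios(r_c)
    return vc * V_MEAN * t / parabolic_modulus(t, r_c)


for r_c in (3.6, 5.0):
    print(r_c, velocity_ratios(r_c), sin_psi(5.0, r_c))
```

Under these assumptions v_c exceeds v'_c whenever r_c is larger than 2^(1/6) sigma (about 4.3 Å), and for such glancing trajectories the computed sine exceeds 1 at large t, which is the problem described above.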
More recently, there have been attempts by Buldyreva et al. to consider the 'exact' trajectory [14,15]. Their work is based on Bykov's method presented 18 years ago, but no computation of line widths was performed at that time [16]. The basic idea of Bykov is that by changing the integration variable from the time $t$ to the distance $r$ when calculating resonance functions, the trajectory dependence of the resonance functions can be explicitly taken into account. It turns out that by introducing the coordinate representation, the 'exact' trajectory model can be more straightforwardly incorporated into the RB formalism. As shown by Equation (10), when one calculates $G_{L_1 K_1 K'_1 L_2}(t, t')$, one needs to know how $\mathbf{R}(t)$ varies with time $t$ along the trajectory. Because $\mathbf{R}(t)$ is no longer an operator, taking its trajectory dependence into account is more natural here. Interested readers can find our detailed explanation of the 'exact' trajectory model in Ref. [12].
It is well known that different collision trajectories have different closest distances $r_c$ and that the strictly head-on collision has the smallest, $r_{c,\min}$. After selecting the trajectory model and the potential, one can easily find values of $r_{c,\min}$ for temperatures of interest. For example, with the 'parabolic' trajectory model, $r_{c,\min}$ can be derived directly from an analytical expression, whereas for the 'exact' trajectory governed by $V_{\text{iso}}(R)$, the value of $r_{c,\min}$ can be derived numerically. With the potential model used in the present study, the value of $r_{c,\min}$ at $T = 296$ K derived from the 'parabolic' trajectory model with the 20th-order cut-off is 3.43564 Å. If one adopts the eighth-order cut-off, $r_{c,\min}$ becomes 3.42070 Å. The value of $r_{c,\min}$ is 3.52242 Å for the 'exact' trajectory model with the 20th-order cut-off. The value of $r_{c,\min}$ is very important when calculating half-widths because it serves as the lower integration limit of Equation (1). In addition, when one investigates profiles of potential models, one can use $r_{c,\min}$ as the lower boundary of the distance $R$.
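The head-on limit can be checked with a few lines of code. The sketch below assumes that r_c,min is the distance at which the isotropic potential equals the kinetic energy evaluated at the mean relative speed, (1/2) m v̄^2 = (4/pi) k_B T; this condition is an assumption, not a transcription of the paper's equations. With the LJ parameters quoted in this paper (sigma = 3.83 Å, epsilon/k_B = 53.4 K) the closed-form LJ solution comes out very close to the 3.43564 Å quoted above. The bisection routine shows how the same condition could be solved numerically for a general isotropic potential; all function names are hypothetical.

```python
import math


def lj_over_kb(r, sigma, eps_over_kb):
    """LJ 12-6 potential divided by k_B (in K) at distance r (angstrom)."""
    x6 = (sigma / r) ** 6
    return 4.0 * eps_over_kb * (x6 * x6 - x6)


def rc_min_lj(sigma, eps_over_kb, temp):
    """Closed-form head-on closest approach for an LJ 12-6 potential.

    Solves x**12 - x**6 = T / (pi * eps/k_B) for x = sigma / r.
    """
    rhs = temp / (math.pi * eps_over_kb)
    y = 0.5 * (1.0 + math.sqrt(1.0 + 4.0 * rhs))   # positive root, y = x**6
    return sigma / y ** (1.0 / 6.0)


def rc_min_numeric(v_iso_over_kb, temp, lo, hi, tol=1e-10):
    """Bisection for V_iso(r)/k_B = (4/pi)*T.

    Assumes V_iso decreases monotonically on [lo, hi], lies above the
    collision energy at lo, and lies below it at hi.
    """
    energy = 4.0 * temp / math.pi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if v_iso_over_kb(mid) > energy:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


analytic = rc_min_lj(3.83, 53.4, 296.0)
numeric = rc_min_numeric(lambda r: lj_over_kb(r, 3.83, 53.4), 296.0, 2.0, 3.83)
print(analytic, numeric)
```

For the 'exact' trajectory one would pass the directly derived isotropic potential instead of the LJ function; that potential is not available here, so only the LJ case is shown.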
Of course, $r_c$ and $R$ are two different concepts. The former is used to label or to represent the whole collision trajectory, and the latter is the argument of the potential and can vary from $r_c$ to infinity along the trajectory. However, for the present study, there is an important link between them: for a specified $r_c$, all things of interest to us along this trajectory are independent of the behaviour of the potential $V(R)$ for $R < r_c$. As a result, one only needs to investigate profiles of the potential in the range $R > r_c$. In addition, it turns out that when calculating N2-broadened half-widths of the H2O lines, the potential ranges of most interest are those with $R$ starting from the closest distance $r_c$ of collisions to a little beyond. Therefore, we will pay extra attention to profiles of the potential in small $R$ ranges.
As an example, we present in Figure 1 several typical 'parabolic' and 'exact' trajectories for H2O-N2 at $T = 296$ K. For the 'exact' model, three nearly head-on trajectories associated with $r_c = 3.55$, 3.60, and 3.80 Å, respectively, and three glancing trajectories with $r_c = 4.3$, 5.0, and 6.0 Å are shown. Because the repulsive force is dominant for the former, the first three curves bend away from the scattering centre (at the origin of the plane). In contrast, due to the dominant attractive force, the last three curves bend towards the scattering centre. In addition, there is a curve with $r_c = 4.0$ Å in between these two groups. The corresponding seven 'parabolic' trajectories are also shown. They are horizontal straight lines with $r_c$ values as their ordinates. In addition, by adding small solid circles to the curves and small empty circles to the straight lines, we explicitly show how fast the relative motion of the pair proceeds along the trajectories. The travel time from one circle to the next is 0.05 ps. Showing this feature is important because one has to rely on both this and the trajectory to know how $\mathbf{R}(t)$ varies with time. Observant readers can see that the relative motion proceeds with higher speed along the 'parabolic' path than along an 'exact' trajectory that bends away from the centre. The higher 'apparent' velocity $v'_c$ applicable on the straight path is required to compensate for the bending effect in the 'exact' trajectories, so that the increase of the modulus $R(t)$ can, more or less, keep a similar pace to that along the latter. For the same reason, the motion proceeds at a slower speed along the 'parabolic' path than along an 'exact' trajectory that bends towards the centre.

Profiles of the short-range and long-range parts of the potentials
In the present study, we adopt the most recently updated potential model of Gamache and Laraia [7], which consists of the short-range interaction $V_{\text{atom-atom}}(\Omega_1, \Omega_2, \mathbf{R})$ and the long-range dipole-quadrupole and quadrupole-quadrupole interactions $V_{dq}(\Omega_1, \Omega_2, \mathbf{R}) + V_{qq}(\Omega_1, \Omega_2, \mathbf{R})$. Generally speaking, our knowledge of the long-range interactions is better than that of the short-range interactions. In fact, most recent attempts to refine the potential models have mainly addressed the latter.
We present two simple figures to depict the profiles of this potential. First, by choosing $\mathbf{R}$ to lie along the z axis of the space-fixed frame, one can replace the vector $\mathbf{R}$ by its magnitude $R$. Because these interactions also depend on the orientations of H2O and N2, it is impossible to depict them well in a two-dimensional plot. However, by plotting their maxima and minima as functions of $R$ we can determine the ranges of their magnitudes. Using the Monte Carlo method with a sufficiently large number of random selections of the orientations of the H2O and N2 pair, one can easily find the maxima and minima of $V_{\text{atom-atom}}(\Omega_1, \Omega_2, R)$ and $V_{dq}(\Omega_1, \Omega_2, R) + V_{qq}(\Omega_1, \Omega_2, R)$. By averaging the former's values over all these random selections of the orientations, one can obtain the corresponding isotropic component. We note that, in general, the isotropic part of $V_{\text{atom-atom}}(\Omega_1, \Omega_2, R)$ does not match the LJ isotropic model exactly. However, when one calculates half-widths with the 'parabolic' trajectory model, one usually assumes that the isotropic interaction can be represented by the LJ model. In this case, one needs to find the LJ parameters $\sigma$ and $\varepsilon/k_B$ from a fitting procedure. In the present study, we have found that $\sigma = 3.83$ Å and $\varepsilon/k_B = 53.4$ K are the optimal choice for fitting the isotropic part of $V_{\text{atom-atom}}(\Omega_1, \Omega_2, R)$. We note that these values differ slightly from those derived by Gamache and Laraia (see Table 3 of Ref. [7]). The reason for this is that their values were obtained by adopting the eighth-order cut-off and 20 correlation functions, whereas ours are from higher cut-offs and are already converged. On the other hand, if one uses the 'exact' trajectory model in calculations, it is better to use the isotropic potential derived directly, without going through the intermediate LJ model. With respect to $V_{dq}(\Omega_1, \Omega_2, R)$ and $V_{qq}(\Omega_1, \Omega_2, R)$, it is well known that they are purely anisotropic.
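The Monte Carlo scan over orientations can be sketched as follows. This is a toy illustration under stated assumptions: the site coordinates and the LJ pair parameters are hypothetical placeholders (they are not the parameters of Ref. [7]), only a site-site 12-6 term is included, and far fewer random selections are used than in a production calculation. Random orientations are drawn uniformly by normalizing a four-component Gaussian vector into a unit quaternion.

```python
import math
import random

# Hypothetical site coordinates (angstrom, molecule-fixed frame) and LJ pair
# parameters (eps in cm^-1, sigma in angstrom). Placeholders for illustration.
H2O_SITES = [("O", (0.0, 0.0, 0.07)), ("H", (0.76, 0.0, -0.52)), ("H", (-0.76, 0.0, -0.52))]
N2_SITES = [("N", (0.0, 0.0, 0.55)), ("N", (0.0, 0.0, -0.55))]
PAIR_PARAMS = {("O", "N"): (30.0, 3.1), ("H", "N"): (15.0, 2.7)}


def random_rotation(rng):
    """Uniformly distributed rotation matrix from a normalized Gaussian quaternion."""
    q = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(c * c for c in q))
    w, x, y, z = (c / norm for c in q)
    return (
        (1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)),
        (2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)),
        (2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)),
    )


def rotate(m, v):
    """Apply a 3x3 rotation matrix to a 3-vector."""
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))


def site_site_energy(rot_a, rot_b, separation):
    """Sum of 12-6 LJ terms over site pairs; molecule b is centred at (0, 0, separation)."""
    total = 0.0
    for label_a, pos_a in H2O_SITES:
        pa = rotate(rot_a, pos_a)
        for label_b, pos_b in N2_SITES:
            pb = rotate(rot_b, pos_b)
            r = math.dist(pa, (pb[0], pb[1], pb[2] + separation))
            eps, sigma = PAIR_PARAMS[(label_a, label_b)]
            x6 = (sigma / r) ** 6
            total += 4.0 * eps * (x6 * x6 - x6)
    return total


def scan_orientations(separation, n_samples=20000, seed=0):
    """Return (minimum, maximum, orientation average) of the toy potential."""
    rng = random.Random(seed)
    values = [
        site_site_energy(random_rotation(rng), random_rotation(rng), separation)
        for _ in range(n_samples)
    ]
    return min(values), max(values), sum(values) / n_samples


for separation in (3.6, 4.3, 6.0):
    print(separation, scan_orientations(separation, n_samples=5000))
```

The orientation average plays the role of the isotropic component; fitting it to an LJ form would be a separate step that is not shown here.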
We present these results in Figure 2. As shown in the figure, $V_{\text{atom-atom}}(\Omega_1, \Omega_2, R)$ can produce very large repulsive forces at short distances. Beyond $R = 4.3$ Å, the maximum and minimum of $V_{\text{atom-atom}}(\Omega_1, \Omega_2, R)$ are well within the range spanned by the maximum and minimum of $V_{dq}(\Omega_1, \Omega_2, R) + V_{qq}(\Omega_1, \Omega_2, R)$. This is consistent with the general impression that $V_{\text{atom-atom}}(\Omega_1, \Omega_2, R)$ is the major interaction in the short distance region and $V_{dq}(\Omega_1, \Omega_2, R) + V_{qq}(\Omega_1, \Omega_2, R)$ plays a dominant role in the long distance region. One can see that, in Figure 2, the distance at which the isotropic potential changes sign and also its maximum depth match the values of $\sigma$ and $\varepsilon/k_B$ given above. However, Figure 2 alone does not provide enough information to judge which of $V_{\text{atom-atom}}(\Omega_1, \Omega_2, R)$ and $V_{dq}(\Omega_1, \Omega_2, R) + V_{qq}(\Omega_1, \Omega_2, R)$ is the major component in the range $r_{c,\min} < R < 4.3$ Å. Where the maxima of $V_{\text{atom-atom}}$ exceed those of $V_{dq} + V_{qq}$, this does not necessarily mean that the former is more likely to dominate the latter. The reason for this is that these maxima only represent extreme events occurring in special orientations.

Figure 2. Maxima and minima of $V_{\text{atom-atom}}(\Omega_1, \Omega_2, R)$ and $V_{dq}(\Omega_1, \Omega_2, R) + V_{qq}(\Omega_1, \Omega_2, R)$ as functions of $R$, plotted as the two solid black lines and the two dot-dashed red lines, respectively. The isotropic part of $V_{\text{atom-atom}}(\Omega_1, \Omega_2, R)$ is also plotted as the dotted green curve.
In practice, the importance of an event depends not only on the event itself, but also on how often it happens. Thus, we present two subsidiary figures to provide a more in-depth analysis. We assume that both the H2O and N2 molecules rotate freely in space. Then, in order to provide a quantitative measure, over all possible combinations of the orientations $\Omega_1$ and $\Omega_2$, of how often specified values of $V_{\text{atom-atom}}(\Omega_1, \Omega_2, R)$ occur, we calculate the probabilities with which $V_{\text{atom-atom}}(\Omega_1, \Omega_2, R)$ takes these values. The calculations are carried out with $10^8$ random selections of the orientations of H2O and N2 and with a 0.5 cm$^{-1}$ resolution for the specified values $V_0$. We also calculate the corresponding probabilities for $V_{dq}(\Omega_1, \Omega_2, R) + V_{qq}(\Omega_1, \Omega_2, R)$. We present these probabilities for four different distances, $R = 3.60$, 3.80, 4.00, and 4.30 Å, in Figures 3(a) and 3(b). By comparing the distributions of $V_{\text{atom-atom}}$ at different distances, one finds that, as $R$ increases, both the heights and widths of the peak increase and the right wings decrease quickly. Based on these features, one can conclude that $V_{\text{atom-atom}}(\Omega_1, \Omega_2, R)$ is more likely to take only small negative values within the wider ranges predicted by Figure 2. In addition, as $R$ increases, there are fewer and fewer chances of observing large magnitudes. In contrast, the probability distributions for $V_{dq}(\Omega_1, \Omega_2, R) + V_{qq}(\Omega_1, \Omega_2, R)$ are almost symmetric and have flat plateaus with relatively narrow edges. As $R$ increases, the plateau becomes higher and narrower. This means that $V_{dq}(\Omega_1, \Omega_2, R) + V_{qq}(\Omega_1, \Omega_2, R)$ has a uniform chance of taking values in most parts of the allowed regions in Figure 2. In the remaining parts, the chance quickly diminishes to zero as the value reaches the maximum or minimum limit.
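The probability estimates described above amount to binning sampled potential values at a fixed resolution and summing fractions that satisfy a condition. The sketch below is a minimal illustration: the samples are synthetic stand-ins drawn from a Gaussian (real input would be potential values at randomly selected orientations), the threshold `v_dq_max` is a hypothetical number, and the function names are not from the paper.

```python
import math
import random
from collections import Counter


def probability_distribution(samples, resolution=0.5):
    """Return {bin centre: probability} for bins of the given width (cm^-1)."""
    counts = Counter(math.floor(v / resolution) for v in samples)
    n = len(samples)
    return {(k + 0.5) * resolution: c / n for k, c in sorted(counts.items())}


def summed_probability(samples, condition):
    """Fraction of samples satisfying a condition, e.g. V < 0."""
    return sum(1 for v in samples if condition(v)) / len(samples)


# Synthetic stand-in samples (cm^-1), for illustration only.
rng = random.Random(0)
v_aa = [rng.gauss(-20.0, 15.0) for _ in range(10000)]
v_dq_max = 30.0  # hypothetical maximum of V_dq + V_qq at this distance

dist = probability_distribution(v_aa)
p_negative = summed_probability(v_aa, lambda v: v < 0.0)
p_above = summed_probability(v_aa, lambda v: v > v_dq_max)
print(len(dist), p_negative, p_above)
```

The two summed probabilities correspond in spirit to the fraction of orientations with negative values and the fraction exceeding the maximum of the long-range part.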
With respect to the interesting question concerning which is the major component in the range from $R = r_{c,\min}$ to $R = 4.3$ Å, one can roughly claim that $V_{\text{atom-atom}}(\Omega_1, \Omega_2, R)$ has a good chance of having larger magnitudes than $V_{dq}(\Omega_1, \Omega_2, R) + V_{qq}(\Omega_1, \Omega_2, R)$ … The chance is reduced significantly at $R = 3.8$ Å and becomes almost zero at $R = 4.0$ Å and beyond.
In addition, we provide some numerical results related to Figures 2 and 3(a) and (b) in Table 1. In columns 2 and 3 of the table we list the maxima and minima of $V_{dq}(\Omega_1, \Omega_2, R) + V_{qq}(\Omega_1, \Omega_2, R)$ … 3.60, 3.80, 4.00, and 4.30 Å, respectively. Columns 4 and 5 give the maxima and minima of $V_{\text{atom-atom}}(\Omega_1, \Omega_2, R)$. In the column headed Sum P Vaa<0 we provide the probabilities with which $V_{\text{atom-atom}}(\Omega_1, \Omega_2, R)$ has negative values. Finally, in the last column headed Sum P Vaa>MaxVdq, we list the probabilities with which values of $V_{\text{atom-atom}}(\Omega_1, \Omega_2, R)$ are larger than the maxima of $V_{dq}(\Omega_1, \Omega_2, R) + V_{qq}(\Omega_1, \Omega_2, R)$.
It is worth mentioning that the above discussions are based on the assumption that all orientations of H$_2$O are equally accessible. The same assumption also applies for N$_2$. However, when one deals with specified states for H$_2$O and N$_2$, the chances of occupying orientations are not uniform and they are determined by the corresponding wave functions. As a result, one has to use a more sophisticated method by introducing absolute squares of the products of the wave functions of H$_2$O and N$_2$ as the weighting functions in the averaging processes to find the probability distributions applicable for these states. However, we will not pursue this here, given the fact that when one calculates the half-width for a H$_2$O line of interest, not only are its initial and final states involved but many other states (for example, see those labelled by the summation indices $j'_i$ and $\tau'_i$ in Equation (8)) also participate. With respect to N$_2$, all states are taken into account. Therefore, we believe the general trends presented above are more or less applicable here.
In summary, we can consider $\sigma$ (= 3.83 Å), one of the LJ parameters mentioned above, as a threshold when comparing $V_{\text{atom-atom}}(\Omega_1, \Omega_2, R)$ and $V_{dq}(\Omega_1, \Omega_2, R) + V_{qq}(\Omega_1, \Omega_2, R)$. As long as $R < \sigma$, both are the main components of the total potential. For $R > \sigma$, the former becomes a minor component and, as $R$ increases further beyond 4.3 Å, the effects of the former are very weak or even negligible.
Figure axis labels: values of $V_{a\text{-}a}$ and $V_{dq} + V_{qq}$ (cm$^{-1}$); distributions of $V_{a\text{-}a}$ and $V_{dq} + V_{qq}$.
In the fitting procedure to optimize the parameters $\sigma_{ij}$ and $\varepsilon_{ij}$ of $V_{\text{atom-atom}}(\Omega_1, \Omega_2, R)$, Gamache and Laraia selected two lines (i.e. $6_{1,6} \leftarrow 5_{2,3}$ at 22 GHz and $3_{1,3} \leftarrow 2_{2,0}$ at 183 GHz) and made comparisons between theoretically calculated values and measurements [7]. According to our division, these two lines belong to the first category because their half-width values are relatively large (their air-broadened half-widths listed in HITRAN 2006 are 0.0942 and 0.1014 cm$^{-1}$ atm$^{-1}$, respectively). It is well known that air-broadened half-widths are smaller than N$_2$-broadened half-widths because O$_2$, a minor component of air, is less effective than N$_2$, the major component of air, in causing line broadening. Based on the same potential model with optimized values of $\sigma_{ij}$ and $\varepsilon_{ij}$, we repeat their calculations for these two lines, but with more choices for the cut-off orders, the number of correlations, and the trajectory models.
In Table 2, we present our calculated results for these two lines at $T = 296$ and 220 K, respectively. In the table, the letter c indicates correlations, p the 'parabolic' trajectory model, and e the 'exact' trajectory model. As shown in the table, the calculated half-widths with the lowest eighth-order cut-off and only including 20 correlations are already converged. In addition, the differences between values derived from the 'parabolic' trajectory model and the more accurate 'exact' trajectory model are very small. However, one cannot simply assume that the applicability of these conclusions can be extended to all other lines. It turns out that they are valid only for lines with large half-width values.
Next, we present more detailed analyses for line $3_{1,3} \leftarrow 2_{2,0}$ because the features of the other line $6_{1,6} \leftarrow 5_{2,3}$ are very similar to those presented here. First, in order to explain, with respect to the first kind of cut-off, why calculated half-widths converge so quickly, we plot in Figure 4 $\mathrm{Re}(S_2)$, $\exp(-\mathrm{Re}(S_2))$, and $r_c\,(v_c'^2/v^2)\,(1 - e^{-\mathrm{Re}\,S_2})$ derived by including 20, 38, and 88 correlations and using the 20th-order cut-off. The latter is the integrand of the expression for the half-width given by Equation (1). Note that $S_2$ is a function of $r_c$ and, for simplicity, this argument is omitted here.
As shown in the figure, there are significant differences among the values of $\mathrm{Re}(S_2)$ derived by including 20 correlations and those obtained with more correlations in the range $r_c < 3.83$ Å. For $r_c > 3.83$ Å, the differences diminish. Concerning $\mathrm{Re}(S_2)$ alone, one can conclude that the values derived from 20 correlations are not converged, but those from 38 correlations are well converged and those from 88 correlations are completely converged. On the other hand, by looking at the three blue curves representing $\exp(-\mathrm{Re}(S_2))$, whose values have been multiplied by 1000 in the figure, one can see that the values for $r_c < 3.83$ Å are less than 0.007, which means that all three factors of $1 - \exp(-\mathrm{Re}(S_2))$ are always close to 1. Finally, one can see that the integrands represented by the three red lines are almost identical. An essential point in Equation (1), besides the lower integration limit $r_{c,\min}$, is that it is the integrand itself, and not $\mathrm{Re}(S_2)$, that determines the half-width.
In order to explain, with respect to the second kind of cut-off, why the calculated half-widths converge so quickly, we plot in Figure 5 these three terms derived with the eighth-, 14th-, and 20th-order cut-offs. The plot demonstrates that, with respect to the second kind of cut-off, the calculated half-width of this line with the eighth-order cut-off is well converged. The patterns of Figure 5 are very similar to those given in Figure 4.
Thus, all conclusions drawn from Figure 4 are applicable here. In the present case, the values of $\exp(-\mathrm{Re}(S_2))$ are less than 0.008 for $r_c < 3.83$ Å. Note that there are slight differences in the values of $r_{c,\min}$, the minimum values of $r_c$, associated with the different cut-offs. With the 'parabolic' trajectory model, the value of $r_{c,\min}$ is determined by the LJ parameters. The cut-off introduced when evaluating $V_{\text{atom-atom}}(R)$ affects the determination of its isotropic component, hence the LJ parameters in the modelling, and consequently the value of $r_{c,\min}$ obtained from Equation (19).
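The damping argument behind Figures 4 and 5 can be made concrete with a small numerical sketch. The values of Re(S2) below are hypothetical and chosen only for illustration (exp(-5) is about 0.0067, comparable to the bound of 0.007 quoted above). The sketch applies the same 40% change in Re(S2) to a strongly and a weakly interacting case and compares the resulting change in the factor 1 - exp(-Re S2) that enters the integrand.

```python
import math

def damped_factor(re_s2):
    """Factor 1 - exp(-Re S2) that enters the half-width integrand."""
    return 1.0 - math.exp(-re_s2)

def relative_change(re_s2_low, re_s2_high):
    """Relative change of the damped factor when Re S2 rises from low to high."""
    low = damped_factor(re_s2_low)
    high = damped_factor(re_s2_high)
    return (high - low) / high

# Hypothetical values: the same 40% increase in Re S2 in both cases.
strong = relative_change(5.0, 7.0)    # large Re S2, as for a broad line
weak = relative_change(0.05, 0.07)    # small Re S2, as for a narrow line

print(f"large Re(S2): {100.0 * strong:.2f}% change in 1 - exp(-Re S2)")
print(f"small Re(S2): {100.0 * weak:.2f}% change in 1 - exp(-Re S2)")
```

When Re(S2) is large, the exponential suppresses almost all of the change, which is why a non-converged Re(S2) can still give a converged integrand. When Re(S2) is small, the factor is nearly linear in Re(S2) and the change passes through almost undamped.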

Comments on the selection
Before proceeding with our analyses of other lines belonging to the second and third categories, we would like to comment on Gamache's choice of selecting the two lines $6_{1,6} \leftarrow 5_{2,3}$ and $3_{1,3} \leftarrow 2_{2,0}$ when optimizing the short-range potential $V_{\text{atom-atom}}(R)$. In fact, based on the same arguments used to explain the different convergence behaviour between $\mathrm{Re}(S_2(r_c))$ and the half-width, one can judge whether the selection of these two lines is a good choice for optimizing the short-range site-site model.
As shown in Figures 4 and 5, changes in the choice of the cut-off can yield significant differences for the calculated $\mathrm{Re}(S_2(r_c))$ for $r_c < 3.83$ Å. The smaller $r_c$, the larger the differences. However, it is the variations in the values of $V_{\text{atom-atom}}(R)$ resulting from different cut-off choices that directly cause these differences. Indeed, the values of $V_{\text{atom-atom}}(R)$ can vary significantly over short distances as the cut-offs vary. The shorter $R$, the larger the variations. The similarity between the patterns of $V_{\text{atom-atom}}(R)$ and $\mathrm{Re}(S_2(r_c))$ demonstrates the intrinsic link between them. Of course, this is expected because $\mathrm{Re}(S_2(r_c))$ is determined by the potential $V_{\text{tot}}(R)$ and, for short distances, $V_{\text{atom-atom}}(R)$ becomes its major part. Now we come to our point. With respect to changes in $\mathrm{Re}(S_2(r_c))$, what matters is not the source causing the variations in $V_{\text{atom-atom}}(R)$, but the variations in $V_{\text{atom-atom}}(R)$ themselves. As long as the variations in $V_{\text{atom-atom}}(R)$ for short $R$ distances occur on the same scale, no matter if they are caused by the cut-offs, by adopting different parameters $\sigma_{ij}$ and $\varepsilon_{ij}$, or by something else, the changes in $\mathrm{Re}(S_2(r_c))$ in the small $r_c$ region should remain the same. On the other hand, $\mathrm{Re}(S_2(r_c))$ in the large $r_c$ region is mainly determined by the long-range interactions because the latter become dominant components at large $R$ distances. There is no room for $V_{\text{atom-atom}}(R)$ itself and its variations to play any role at all. Given the fact that lines with large half-widths are always associated with relatively large $\mathrm{Re}(S_2(r_c))$ in the entire range and, in general, their magnitude peaks occur in the small $r_c$ region, one can conclude that the effects of $V_{\text{atom-atom}}(R)$ on theoretically calculated half-widths are dramatically diminished. In fact, as shown in Table 2, there is almost no effect on the calculated half-widths by varying the two cut-offs in the calculations. The same conclusion must be true if one varies the parameters $\sigma_{ij}$ and $\varepsilon_{ij}$ of the site-site model.
In summary, one can conclude that the choice of these two lines to optimize the site-site model is poor. The calculated half-widths do not depend sensitively on the parameters $\sigma_{ij}$ and $\varepsilon_{ij}$ at all. As a result, not only is one not able to find optimum values, but one may also unintentionally obtain values outside physically acceptable limits.

What really happens in the fitting process
Before completing our comments on line selection to optimize the site-site potential, we have to answer the question concerning what really happens in optimization practice. In contrast with what we have claimed above, one can obtain different half-width values for the two lines by using different parameters $\sigma_{ij}$ and $\varepsilon_{ij}$. In other words, by changing the values of $\sigma_{ij}$ and $\varepsilon_{ij}$, the theoretically calculated half-widths of these two lines can also be changed. In our opinion, what mainly happens in the fitting calculations are changes in the isotropic part of the potential. By varying the parameters $\sigma_{ij}$ and $\varepsilon_{ij}$ of the site-site model, the isotropic interaction also changes. The latter is modelled by a LJ model with two parameters, $\sigma$ and $\varepsilon$. Because $V_{dq}$ and $V_{qq}$ are purely anisotropic interactions, they do not make contributions to the isotropic part at all. On the other hand, with the RB formalism, the isotropic potential plays an important role in determining the calculated half-widths. For example, if one selects the 'parabolic' model and uses Equation (1) to derive half-widths, the lower integration limit $r_{c,\min}$ and the 'apparent' velocity $v'_c$ are functions of $\sigma$ and $\varepsilon$. Therefore, we expect that, in the fitting process, the main source of changes is $\sigma$ and $\varepsilon$, and not $\sigma_{ij}$ and $\varepsilon_{ij}$. There is further evidence to support our claim, but we will not present it here.

Lines in the third category
As an example, we choose the line $17_{2,15} \leftarrow 16_{1,16}$ whose air-broadened half-width has been measured by different research groups [17,18]. Its measured air-broadened half-widths are small: the value provided by Rinsland et al. is 0.0207 cm$^{-1}$ atm$^{-1}$ and the value given by Toth is 0.0208 cm$^{-1}$ atm$^{-1}$. Based on the same potential model as used above, we calculate the N$_2$-broadened half-width for $T = 296$ and 220 K with different combinations of the two kinds of cut-offs, and present the results (in units of cm$^{-1}$ atm$^{-1}$) in Table 3. As shown by the table, the calculated value with the eighth-order cut-off and including 20 correlations is not converged. In fact, the half-width at $T = 296$ K derived from the eighth order and 20 correlations is 13.1% less than that from the 20th order and 20 correlations and 29.4% less than that from the 20th order and 88 correlations. With respect to the first cut-off, one must include 88 correlations. For the second cut-off, one has to use at least the 14th-order cut-off. Furthermore, by making adjustments from N$_2$ to air by dividing by the factor 1.09, one can compare calculated values with measurements. Thus, the calculated air-broadened half-width from the eighth order and including 20 correlations is about 0.0237 cm$^{-1}$ atm$^{-1}$ and that from the 20th order and 88 correlations becomes 0.0336 cm$^{-1}$ atm$^{-1}$. It appears that the former agrees better with the measurements while adding more correlations leads to larger differences. If one only looked at this, one could claim that higher cut-offs yield errors and they must be abandoned. Of course, this specious argument is false.
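The arithmetic in this comparison is simple enough to reproduce. The sketch below uses only numbers quoted in the text (the factor 1.09, the two air-adjusted calculated values, and the measurement of Rinsland et al.). The function names are illustrative, and the N$_2$ values are back-computed from the rounded air values, so the percentages can differ from those in Table 3 in the last digit.

```python
N2_TO_AIR_RATIO = 1.09  # factor quoted in the text for converting N2- to air-broadening

def n2_to_air(gamma_n2):
    """Convert an N2-broadened half-width to an air-broadened one."""
    return gamma_n2 / N2_TO_AIR_RATIO

def percent_difference(value, reference):
    """Signed difference of value relative to reference, in percent."""
    return 100.0 * (value - reference) / reference

measured_air = 0.0207    # cm^-1 atm^-1, Rinsland et al.
calc_air_low = 0.0237    # eighth order, 20 correlations (air-adjusted)
calc_air_high = 0.0336   # 20th order, 88 correlations (air-adjusted)

# Recover the N2-broadened values implied by the rounded air values.
calc_n2_low = calc_air_low * N2_TO_AIR_RATIO
calc_n2_high = calc_air_high * N2_TO_AIR_RATIO

low_vs_high = percent_difference(calc_n2_low, calc_n2_high)
low_vs_measured = percent_difference(calc_air_low, measured_air)
high_vs_measured = percent_difference(calc_air_high, measured_air)

print(f"lowest vs highest cut-offs: {low_vs_high:.1f}%")
print(f"lowest cut-offs vs measurement: {low_vs_measured:+.1f}%")
print(f"highest cut-offs vs measurement: {high_vs_measured:+.1f}%")
```

The converged value lies much further from the measurement than the non-converged one, which is the apparent paradox discussed above.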
In order to explain why the calculated half-widths for the line $17_{2,15} \leftarrow 16_{1,16}$ converge so slowly, in Figure 6 we plot $\mathrm{Re}(S_2)$, $\exp(-\mathrm{Re}(S_2))$, and $r_c\,(v_c'^2/v^2)\,(1 - e^{-\mathrm{Re}\,S_2})$ derived by including 20, 38, and 88 correlations. The calculations are carried out with the 20th-order cut-off and the 'parabolic' trajectory model. Comparing Figure 6 with Figure 4 shows that the magnitudes of $\mathrm{Re}(S_2)$ for this line are more than one order of magnitude smaller than those for the line $3_{1,3} \leftarrow 2_{2,0}$ and, in addition, they decrease more quickly as $r_c$ increases. In fact, for $r_c > 3.83$ Å their magnitudes are close to zero. By comparing the three black curves representing $\mathrm{Re}(S_2)$ derived by including different numbers of correlations, one can conclude that one curve differs significantly from the others. This indicates that the calculated $\mathrm{Re}(S_2)$ with fewer correlations are not converged. By looking at the three blue curves representing $10 \times \exp(-\mathrm{Re}(S_2))$ in Figure 6, one can see that the values of $\exp(-\mathrm{Re}(S_2))$ increase from 0 as $r_c$ increases from $r_{c,\min}$ and they become close to an asymptotic value of 1 at $r_c > 3.83$ Å. Finally, one can see that the magnitudes of the integrands represented by the three red curves decrease very quickly and are close to zero for $r_c > 3.83$ Å. In contrast to Figure 4, the differences among these three red curves are significant.
Similarly, in Figure 7 we also plot these three terms derived with the eighth-, 14th-, and 20th-order cut-offs and including 20 correlations. The patterns shown in Figure 7 are very similar to those in Figure 6. Thus, all conclusions drawn from Figure 6 are also applicable here. The only difference is that, with respect to the second kind of cut-off, the plots in Figure 7 show that the results derived from the 14th-order cut-off are well converged and those from the 20th-order cut-off are completely converged.
In summary, by analysing Figures 6 and 7, one can draw two conclusions. The first is that the dominant contributions to the calculated half-width come from those collisions whose closest distances $r_c$ are less than 3.83 Å. The second is that varying values of $V_{\text{atom-atom}}$ …

Lines in the second category
Finally, we choose the line $9_{9,0} \leftarrow 8_{6,3}$ whose air- and N$_2$-broadened half-widths have been measured by Toth [18]. The measured values are moderate, 0.0525 and 0.0608 cm$^{-1}$ atm$^{-1}$, respectively. Based on the same potential model, we calculate the N$_2$-broadened half-width for $T = 296$ and 220 K and present the results in Table 3. Looking at these values, one can see that the convergence behaviour of this line is better than for $17_{2,15} \leftarrow 16_{1,16}$, but is worse than for $3_{1,3} \leftarrow 2_{2,0}$. The half-width at $T = 296$ K derived from the eighth order and 20 correlations is 1.7% less than that from the 20th order and 20 correlations and 10.9% less than that from the 20th order and 88 correlations. This demonstrates the general trend that the smaller the half-width, the worse the convergence. Agreement between the calculated half-widths and the measurements for $9_{9,0} \leftarrow 8_{6,3}$ is not as bad as for $17_{2,15} \leftarrow 16_{1,16}$. It is also apparent that the agreement improves by adopting higher cut-offs, but one line does not mean too much. Similarly, we show in Figures 8 and 9 how the calculated $\mathrm{Re}(S_2)$, $\exp(-\mathrm{Re}(S_2))$, and $r_c\,(v_c'^2/v^2)\,(1 - e^{-\mathrm{Re}\,S_2})$ for this line vary as one adopts different choices for the cut-offs. By comparing Figure 8 with Figures 4 and 6 and also Figure 9 with Figures 5 and 7, one can see that the $\mathrm{Re}(S_2)$ values of line $9_{9,0} \leftarrow 8_{6,3}$ derived using different choices of the cut-offs also differ. However, their magnitudes fall between those of lines $3_{1,3} \leftarrow 2_{2,0}$ and $17_{2,15} \leftarrow 16_{1,16}$. As a result, the effects of $\mathrm{Re}(S_2)$ on the calculated half-widths in the region $r_c < 3.83$ Å are less damped than for $3_{1,3} \leftarrow 2_{2,0}$, but are more damped than for $17_{2,15} \leftarrow 16_{1,16}$. This implies that $V_{\text{atom-atom}}$ plays a more important role for $9_{9,0} \leftarrow 8_{6,3}$ than for $3_{1,3} \leftarrow 2_{2,0}$, but a less important one than for $17_{2,15} \leftarrow 16_{1,16}$. We will not repeat the detailed discussions here because their profiles fall between those for the other two lines.

Summary of the analysis of the lines in the three categories
3.4.1. Comparisons among lines in the three categories
In order to make clearer comparisons among the lines belonging to different categories, we present in Figure 10 the calculated integrands in Equation (1) using three sets of curves. The results corresponding to $3_{1,3} \leftarrow 2_{2,0}$, $9_{9,0} \leftarrow 8_{6,3}$, and $17_{2,15} \leftarrow 16_{1,16}$ are plotted as the black, red, and green curves, respectively. For each of these lines, values derived from the lowest choice of the cut-offs (i.e. the eighth order and 20 correlations) and from the highest choice (i.e. the 20th order and 88 correlations) are represented by dot-dashed and solid curves, respectively. As shown in the figure, the magnitudes of these three lines reach their maxima at $r_c = r_{c,\min}$ and are almost the same. As $r_c$ increases, their magnitudes decrease at different rates. The $3_{1,3}$ … of the calculated half-widths, these patterns are consistent with the magnitudes of their half-widths.
On the other hand, by looking at the differences between the results obtained from different cut-offs, one can see that these three lines have different convergence behaviour, ranging from the best, to intermediate, to the worst, respectively. Finally, in order to provide a quantitative measure of how much of the calculated half-width comes from specified collisions, one can introduce percentage contributions from collisions whose closest distances are in the range from $r_{c,\min}$ to $r_c$. The mathematical definition of this measure is $\gamma(r_{c,\min}, r_c)/\gamma(r_{c,\min}, +\infty) \times 100\%$, where the two arguments of $\gamma$ are the lower and upper integration limits in Equation (1) and the result is a function of $r_c$. We have made calculations at $T = 296$ K with the 'parabolic' trajectory model for all three lines and we present the results in Figure 11. In order to remove convergence errors, all the calculations are carried out with the 20th order and including 88 correlations. Because the order of these three lines is arranged according to their half-widths from the largest to the smallest, one can conclude from the figure that the smaller the half-width, the larger the percentage contributions from nearly head-on collisions. In fact, for the line $17_{2,15}$ … to model the short-range interactions and nearly head-on collision trajectories well. Any errors in modelling these could easily influence the calculated results and could thus lead to large uncertainties. As a result, we would like to make two comments for lines belonging to this category here. The first is that the theoretically calculated results in the literature are likely to be unreliable because they were derived by adopting lower cut-offs. The second is that even using our method to solve the convergence challenge, one must also describe nearly head-on collision trajectories accurately.
At present, we know that the 'exact' model is better than the 'parabolic' model, but it is still based on the assumption that the trajectory is determined by the isotropic part of the potential only. Unfortunately, one can expect that the anisotropic interaction can significantly affect the trajectories, and the closer the distance, the larger the effect. So far, no significant progress has been made in this area, and one has to keep in mind that this challenge remains.
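The percentage-contribution measure introduced above (the partial integral from the smallest closest distance up to a chosen closest distance, divided by the full integral) can be sketched numerically as follows. Everything about the integrand here is an assumption made for illustration: Re(S2) is replaced by a hypothetical inverse-power law, the velocity factor is set to 1, and the upper limit is truncated at 30 Å instead of infinity. Only the threshold of 3.83 Å is taken from the text.

```python
import numpy as np

SIGMA = 3.83  # Angstrom, the LJ parameter used as a threshold in the text

def toy_integrand(r_c, strength, power=12):
    """Hypothetical stand-in for r_c * (1 - exp(-Re S2)); velocity factor set to 1."""
    re_s2 = strength * (SIGMA / r_c) ** power
    return r_c * (1.0 - np.exp(-re_s2))

def cumulative_percentage(r_c, integrand):
    """Running trapezoidal integral from r_c[0], as a percentage of the full integral."""
    steps = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r_c)
    cumulative = np.concatenate(([0.0], np.cumsum(steps)))
    return 100.0 * cumulative / cumulative[-1]

r_c = np.linspace(3.2, 30.0, 20001)  # assumed r_c,min = 3.2 Angstrom
pct_strong = cumulative_percentage(r_c, toy_integrand(r_c, strength=50.0))
pct_weak = cumulative_percentage(r_c, toy_integrand(r_c, strength=0.05))

# Share of the half-width coming from collisions with r_c below SIGMA.
share_strong = float(np.interp(SIGMA, r_c, pct_strong))
share_weak = float(np.interp(SIGMA, r_c, pct_weak))
print(f"strongly interacting case: {share_strong:.0f}% from r_c < {SIGMA} A")
print(f"weakly interacting case:   {share_weak:.0f}% from r_c < {SIGMA} A")
```

Even in this crude model, the weakly interacting case draws a much larger share of its integral from close collisions, in line with the trend reported for Figure 11.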

3.4.2. Guidance to select lines to optimize potential models
When selecting lines to make comparisons between theoretically calculated half-widths and measurements, the main concern is usually the accuracy of the measurements. The preference is to choose important lines whose half-widths have been measured repeatedly and accurately. In addition, it seems there is a lack of further theoretical considerations as a guide to making choices. Analyses as described above could help theorists to make better choices. From the theoretical point of view, how to select target lines should depend on one's purpose. Thus, our suggestion is that if one wants to optimize the short-range part of the potential, one should select lines with small half-widths, whereas if one wants to optimize the long-range part, it is better to choose lines with large half-widths.
It is well known that the model of the interaction potential between two molecules can be very complicated. Certain features of the potential may affect a given physical property, but other features may not. As a result, one has to combine all diagnostic means available to grasp the complexity. This general principle is also applicable even when measuring the same physical property. Therefore, it is better to select sample lines from all three categories and to optimize the long- and short-range potentials simultaneously. There are two arguments to support this claim. First, except for the isotropic part, the applicability of the anisotropic part of $V_{\text{atom-atom}}$ has not been tested in the fitting process. This statement looks strange because the applicability of the total $V_{\text{atom-atom}}$ has, indeed, been tested for the two lines selected by Gamache and Laraia. However, as we have explained above, in these tests, the effects from the anisotropic potential have been suppressed. This implies that the tests were not well done. How can one trust theoretically calculated half-widths for other lines whose significant contributions come from this untested anisotropic part of $V_{\text{atom-atom}}$? Thus, the larger the contributions from $V_{\text{atom-atom}}$, the less reliable the results. Secondly, based on our understanding of the convergence problem, we are certain that many of the updated theoretical values in HITRAN are not converged. Because theoretically calculated half-widths and shifts play an important role in updating the HITRAN database, needless to say, non-converged results should not be used.
We have calculated the N$_2$-broadened half-widths of all 1639 H$_2$O lines of the pure rotational band based on the same potential model used by Gamache and Laraia. In order to check the convergence problems related to the two kinds of cut-offs and associated with two different trajectory models, we performed many numerical calculations by adopting different choices of the cut-offs and trajectory models and we present some of our results below.

4.1.1. Convergence check for the second kind of cut-off
In order to check the convergence behaviour due to the second kind of cut-off, we present in Figure 12 the calculated half-widths of the 1639 H$_2$O lines of the pure rotational band listed in HITRAN 2006 by selecting the eighth-, 14th-, and 20th-order cut-offs. The calculations are performed including 88 correlations and adopting the 'parabolic' trajectory model. We also calculate the relative errors representing the differences in the half-width values obtained using the lower order cut-offs versus those obtained using the highest cut-off. These results are plotted in Figure 13. In addition, we divide the 1639 lines into three groups according to their error ranges. For example, comparing the results derived from the 88 and 20 correlations, we find that there are 297 lines with errors beyond 10%, 415 lines within 4-10%, and 927 lines below 4%. We show these general error statistics in Table 4. Both the maximum errors for the eighth- and 14th-order cut-offs (i.e. 42% and 4.2%, respectively) occur for the two lines $21_{0,21} \leftarrow 20_{1,20}$ and $21_{1,21} \leftarrow 20_{0,20}$, both located at 390.499 cm$^{-1}$ with the air-broadened half-width 0.0075 cm$^{-1}$ atm$^{-1}$ listed in HITRAN 2006. Based on Figures 12 and 13, one can conclude that, for those lines with small half-width values, the calculated results using the eighth-order cut-off are not converged. The results from the 14th-order cut-off are well converged and those from the 20th order are completely converged.
The error plots in Figure 13 clearly demonstrate that the division into the three groups according to the relative errors coincides with the division according to their half-width values into the three categories discussed above.
It is worth mentioning here that these calculated relative errors correspond to the potential model used in the present study only. We expect that if one uses different potential models in the calculations, these results would vary somewhat; however, we believe that the general features would remain true. Similar claims are applicable for other cases presented later.
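The grouping of lines by relative convergence error used above (errors beyond 10%, within 4-10%, and below 4%) amounts to a simple binning step. The sketch below shows one way to do it; the four half-width pairs are synthetic values invented for the example, and the treatment of errors falling exactly on 4% or 10% is an assumption, since the text does not specify it.

```python
import numpy as np

def relative_errors_percent(gamma_low, gamma_ref):
    """Relative error (in %) of lower-cut-off half-widths against reference values."""
    gamma_low = np.asarray(gamma_low, dtype=float)
    gamma_ref = np.asarray(gamma_ref, dtype=float)
    return 100.0 * np.abs(gamma_low - gamma_ref) / gamma_ref

def count_error_ranges(errors, lower=4.0, upper=10.0):
    """Count lines with errors above upper, between lower and upper, and below lower."""
    errors = np.asarray(errors, dtype=float)
    return {
        "above": int(np.sum(errors > upper)),
        "between": int(np.sum((errors >= lower) & (errors <= upper))),
        "below": int(np.sum(errors < lower)),
    }

# Synthetic half-widths in cm^-1 atm^-1 (illustration only, not results of this work).
gamma_ref = [0.1000, 0.0500, 0.0200, 0.0100]   # highest cut-off
gamma_low = [0.0990, 0.0470, 0.0170, 0.0058]   # lower cut-off

errors = relative_errors_percent(gamma_low, gamma_ref)
counts = count_error_ranges(errors)
print(counts)
```

In this synthetic set the errors grow as the half-width shrinks, mimicking the trend in Figure 13. For the real data, the three counts quoted in the text (297, 415, and 927) add up to the 1639 lines considered.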

Convergence check for the first kind of cut-off
In order to check the convergence related to the first kind of cut-off, we present in Figure 14 the calculated N$_2$-broadened half-widths for the 1639 lines by including 20, 38, 88, and 132 correlations and adopting the 'parabolic' trajectory model. The first three choices correspond to the values 2, 3, and 4 as the upper limit for $L_1$ and keeping 2 as the upper limit for $L_2$. To include 132 correlations corresponds to setting 4 as the upper limit for both $L_1$ and $L_2$. With respect to the highest first kind of cut-off (i.e. including 132 correlations), we calculate relative errors for the results obtained by including fewer correlations. We do not present a figure showing the error distributions for all lines of interest. Instead, in Table 5 we show the numbers of lines in the three error ranges. The maximum error (i.e. 70%) for 20 correlations occurs for the two lines 15 15 …

Figure 12. Comparisons between calculated N$_2$-broadened half-widths derived using the eighth-, 14th-, and 20th-order cut-offs. All calculations are carried out using the 'parabolic' trajectory model and including 88 correlation functions. The 1639 H$_2$O lines of the pure rotational band are arranged in ascending order of their calculated half-widths from the 20th-order cut-off.

Figure 13. Relative convergence errors of the calculated half-widths resulting from adopting the eighth- and 14th-order cut-offs versus the calculated half-width values from the 20th-order cut-off. The calculations are performed including 88 correlations and adopting the 'parabolic' trajectory model.

Comparison of the results derived using different trajectory models
By comparing the results derived using the 'parabolic' and 'exact' trajectory models, one is able to ascertain the effects resulting from adopting a more accurate trajectory model. We compare the calculated half-widths obtained with the highest cut-offs to avoid any convergence errors. The comparison is presented in Figure 15. As shown in the figure, in general, the values for the half-widths derived from the 'exact' trajectory model are smaller than those from the 'parabolic' model. We also present calculated relative errors in Table 6. The maximum error is 139% for the line $21_{1,20} \leftarrow 20_{2,19}$ located at 408.325 cm$^{-1}$ with an air-broadened half-width of 0.0075 cm$^{-1}$ atm$^{-1}$ in HITRAN 2006. Based on these results, one can conclude that the effects on the calculated half-widths from adopting the 'exact' trajectory model are significant.

4.1.4. Comparison of the results obtained from the most accurate theoretical model and those commonly used in the literature
In the present study, the most accurate and sophisticated method for repeating the same calculations is to adopt the 20th-order cut-off, to include 132 correlation functions, and to use the 'exact' trajectory model. The theoretically calculated half-widths of Gamache and Laraia were derived using the eighth-order cut-off, 20 correlation functions and the 'modified parabolic' trajectory model. Because we do not know exactly how the 'modified parabolic' model was derived, we prefer to repeat their calculations with the same cut-offs, but with the original 'parabolic' model. In addition, they used the original RB formalism and we use the modified RB formalism. As a result, our results derived from the eighth-order cut-off and 20 correlations do not match theirs exactly. However, the comparisons presented here are adequate to demonstrate the differences resulting from the adoption of the lower cut-offs and the simpler trajectory model versus the calculated half-widths using the most accurate method. We present the comparisons in Figure 16. We also provide some numerical measures for the relative …

Figure 15. Comparisons between the calculated N$_2$-broadened half-widths derived from the 'parabolic' and the 'exact' trajectory models. The calculations are carried out using the 20th-order cut-off and including 132 correlation functions.

Figure 16. Comparisons between the calculated N$_2$-broadened half-widths derived using the most accurate and sophisticated method (i.e. adopting the 20th-order cut-off, including 132 correlations, and using the 'exact' trajectory model) and those from the method used in the paper by Gamache and Laraia (i.e. with the eighth-order cut-off, 20 correlations, and the 'parabolic' trajectory model). These results are plotted by symbols Δ and ×, respectively.

… [7] are not 'true values' of the half-width that would be obtained without any artificial distortion due to calculation error. It is worth mentioning that our claim does not mean that the values obtained using our most accurate method represent real half-width values, because nobody has verified that the potential model used in the calculations is a 'realistic' model that can represent the interaction between H$_2$O and N$_2$ well. In fact, we have repeatedly demonstrated that the current potential model not only suffers from poorly selected line choices in the optimization process, but also from the use of the lower order cut-off, including fewer correlations, and adopting the simpler trajectory model. As a result, we do not believe that the results derived from our most accurate method are able to predict all half-widths well. In addition, one has to keep in mind that the development of the method to perform numerical calculations is completely based on the modified RB formalism. The latter contains several approximations that could limit its applicability when predicting accurate half-width values. Therefore, it is better to consider the results derived from our most accurate and sophisticated method as the 'true results' within the modified RB formalism and the potential model without any artificial distortion.

… and 'faulty results' representing this potential [7]. Thus, these conclusions become suspect.

First, we present in Figure 17 comparisons between the calculated half-widths from the eighth-order cut-off, 20 correlations, and the 'parabolic' trajectory model and the air-broadened half-widths listed in the HITRAN 2006 database. In the plots, we have transformed the N$_2$-broadened half-widths into air-broadened half-widths by multiplying by 1/1.09. From the figure, it appears that the agreement between the theoretically calculated results and the values listed in HITRAN 2006 is good, although the good agreement may be specious.
In Figure 18 we present a comparison between the theoretically calculated 'true' half-widths derived from the 20th-order cut-off, including 132 correlations, and using the 'exact' trajectory model and those listed in HITRAN 2006. As shown in the figure, the agreement becomes poorer, especially for lines in categories 2 and 3 (i.e. < 0.075 cm$^{-1}$ atm$^{-1}$). This is not surprising because we already know that values derived from the most accurate method are significantly larger than those obtained from the simplest method, especially for lines with small half-widths. We note that some data in HITRAN 2006 may also be responsible for these discrepancies because many data come from theoretical calculations made by Gamache's group a few years ago. Among these, those with small values are not yet converged. In summary, these comparisons clearly demonstrate the necessity of finding a replacement for the current potential model.

Figure 17. Comparisons between the calculated half-widths using the eighth-order cut-off, 20 correlations, and the 'parabolic' trajectory model and the air-broadened half-widths listed in the HITRAN 2006 database. The calculated N$_2$-broadened half-widths have been adjusted to the air-broadened half-widths by multiplying by 1/1.09.

Figure 18. Same as Figure 17 except that the theoretically calculated results are derived from the most accurate and sophisticated method (i.e. adopting the 20th-order cut-off, including 132 correlations, and using the 'exact' trajectory model).

Temperature dependence of the half-width
In atmospheric applications, knowledge of the line parameters at temperatures below 296 K is essential. Therefore, the HITRAN database provides a way of deriving these parameters at temperatures of interest from those listed for 296 K. One assumes that the T dependence can be described as

$$\gamma(T) = \gamma(T_0)\left(\frac{T_0}{T}\right)^{n},$$

where $T_0 = 296$ K and n is the temperature exponent. We note that a positive n indicates that the T dependence is negative (the half-width decreases as T increases) and a negative n means that the T dependence is positive. In HITRAN 2006, a value of n is assigned according to the value of $|m|$, where $m = -j''$ for the P branch and $m = j'' + 1$ for the R branch. For example, n = 0.78 for $|m|$ = 0, 1, and 2, n = 0.77 for $|m|$ = 3, n = 0.73 for $|m|$ = 4, etc. This implies that, in HITRAN 2006, all lines are assumed to have a negative T dependence.

Recently, several groups have questioned the correctness of this assignment formula because they have found that this simple assignment method does not always work well and could yield large errors [19,20]. In addition, based on theoretical calculations, they claim that the temperature exponent n can be negative for many transitions [7]. In general, we also doubt the applicability of this simple assignment of n in HITRAN 2006 for some lines, and we are cautiously open to accepting possibly negative n values. However, the lines claimed to have negative n values are those with small half-widths, whose calculated values are more likely to suffer from convergence problems resulting from both adopting lower-order cut-offs and including fewer correlations. The calculated temperature exponent n therefore inevitably has convergence problems as well. Thus, one cannot trust all conclusions drawn from previous theoretical calculations, and it is worth investigating the T dependence.
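As a minimal sketch of this temperature scaling, assuming the usual power-law form γ(T) = γ(T0)(T0/T)^n with T0 = 296 K, the half-width at another temperature and the index m can be computed as below. The function names are illustrative, and the lookup table holds only the n values quoted in the text; any other |m| is deliberately left unassigned.

```python
T_REF = 296.0  # reference temperature in K

# Only the HITRAN 2006 exponents quoted in the text; other |m| are not filled in.
QUOTED_N_BY_ABS_M = {0: 0.78, 1: 0.78, 2: 0.78, 3: 0.77, 4: 0.73}


def m_index(j_lower: int, branch: str) -> int:
    """m = -j'' for the P branch and m = j'' + 1 for the R branch."""
    if branch == "P":
        return -j_lower
    if branch == "R":
        return j_lower + 1
    raise ValueError("branch must be 'P' or 'R'")


def half_width_at(gamma_ref: float, n: float, temperature: float,
                  t_ref: float = T_REF) -> float:
    """Scale a half-width from t_ref to temperature with exponent n."""
    return gamma_ref * (t_ref / temperature) ** n


# Hypothetical line: j'' = 3 in the R branch, 0.09 cm^-1 atm^-1 at 296 K.
m = m_index(3, "R")
n = QUOTED_N_BY_ABS_M.get(abs(m))  # None if |m| is outside the quoted values
if n is not None:
    print(f"|m| = {abs(m)}, n = {n}, gamma(220 K) = {half_width_at(0.09, n, 220.0):.4f}")
```

With a positive n the half-width at 220 K comes out larger than at 296 K, which is the negative T dependence described above.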
Fortunately, if one adopts the 'parabolic' trajectory model, using our method to calculate at other temperatures of interest is not difficult because the correlation functions whose argument is defined as $z = v'_c t / r_c$ are common for all temperatures [12]. In other words, in order to calculate $\gamma(T)$, one does not need to derive all the correlation functions again. However, if one selects the 'exact' trajectory model, one has to recalculate the correlations for each of the other temperatures of interest. We have calculated the half-width values for T = 220 K for the two lines $6_{1,6} \leftarrow 5_{2,3}$ and $3_{1,3} \leftarrow 2_{2,0}$ belonging to category 1 with several choices for the cut-offs and trajectory models and have already provided these results together with those obtained for T = 296 K in Table 2. For other sample lines (i.e. $17_{2,15} \leftarrow 16_{1,16}$ and $9_{9,0} \leftarrow 8_{6,3}$) in categories 3 and 2, respectively, we have also listed the corresponding results for T = 296 and 220 K in Table 3. As shown in these tables, all lines in categories 1 and 2 exhibit a negative T dependence. For line $17_{2,15} \leftarrow 16_{1,16}$ in category 3, one can see that the results obtained with the lowest cut-offs have a positive T dependence, but all other values from higher cut-offs indicate that the T dependences are still negative. This implies that, due to convergence errors, the prior claim is suspect.
Furthermore, in their recent paper, Gamache and Laraia [7] plotted theoretically calculated T dependences for three lines: $19_{7,13} \leftarrow 18_{6,12}$, $2_{1,1} \leftarrow 2_{0,2}$, and $16_{14,2} \leftarrow 15_{13,3}$. The potential model is the optimized model, the same as used in the present study. Based on their plots, they claim that, for the first two lines, the values of n are positive, and for the last line, n is negative. It is easy for us not only to repeat their calculations, but also to make a comprehensive convergence check by using much higher cut-offs to guarantee convergence. Based on our checks, we find that their n value for $2_{1,1} \leftarrow 2_{0,2}$ is well converged, that for $19_{7,13} \leftarrow 18_{6,12}$ is reasonably converged, but their n for $16_{14,2} \leftarrow 15_{13,3}$ is not converged at all. We present the convergence check for line $16_{14,2} \leftarrow 15_{13,3}$ in Figure 19 but do not present the other two checks here. The calculations are carried out with the 'parabolic' trajectory model and with several choices for the cut-offs, including the highest one. As shown in the figure, the results derived from the two lower cut-offs (i.e. the eighth order plus 20 correlations and the 20th plus 20 correlations) exhibit strong positive T dependences, and those from the two higher cut-offs (i.e. the 20th plus 38 correlations and the 20th plus 88 correlations) exhibit a mild negative T dependence. Figure 19 demonstrates that the claim of a negative n for $16_{14,2} \leftarrow 15_{13,3}$ is due to convergence errors. It is again worth mentioning that the results presented here are dependent on the potential model used in the calculations. Because the current potential model is not a good one, one should not consider these results as reliable theoretical predictions.
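The sign check described here can be illustrated with a short sketch. It assumes the power-law form γ(T) = γ(T0)(T0/T)^n and extracts n from half-widths at two temperatures for each choice of cut-offs. The numerical half-widths below are hypothetical placeholders chosen only to show a sign change; they are not results from this work or from [7].

```python
import math

T_REF = 296.0  # reference temperature in K


def temperature_exponent(gamma_ref: float, gamma_t: float, temperature: float,
                         t_ref: float = T_REF) -> float:
    """Solve gamma(T) = gamma(T_ref) * (T_ref / T)**n for n."""
    return math.log(gamma_t / gamma_ref) / math.log(t_ref / temperature)


def exponents_by_cutoff(results: dict, temperature: float = 220.0) -> dict:
    """Map each cut-off label to the n implied by its two half-widths."""
    return {
        label: temperature_exponent(gamma_ref, gamma_t, temperature)
        for label, (gamma_ref, gamma_t) in results.items()
    }


# HYPOTHETICAL (gamma at 296 K, gamma at 220 K) pairs in cm^-1 atm^-1.
hypothetical = {
    "8th order, 20 correlations": (0.0100, 0.0095),
    "20th order, 88 correlations": (0.0120, 0.0125),
}

for label, n in exponents_by_cutoff(hypothetical).items():
    sign = "positive" if n > 0 else "non-positive"
    print(f"{label}: n = {n:+.3f} ({sign})")
```

If the sign of n changes as the cut-offs are raised, as in this made-up example, the low-order value cannot be regarded as converged.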

Uncertainty analyses and conclusions
In order to investigate the uncertainties associated with the calculated half-widths, let us first consider the individual uncertainties associated with contributions from certain types of collisions. One can claim that, in general, contributions from nearly head-on collisions contain the largest uncertainties and those from glancing collisions have the smallest uncertainties. Besides the fact that the magnitudes of the former are larger than those of the latter, there are other arguments to support this claim.
First, in comparison with the relatively well-established knowledge of the long-range potential, that of the short-range interaction is sparse. More specifically, precise expressions for the long-range multipole interactions are available and values of the multipole moments for H2O and N2 are well known. This is not true for the short-range interactions. Thus, there is a lack of guidance on how to select the short-range models: not only the parameters, but the functional forms themselves. Usually, one uses the LJ form in developing the site-site models. This preference does not result from physical considerations, but rather mainly from the technical convenience in manipulating the spherical expansions for the LJ function in terms of the inverse powers of $r_{ij}$. As mentioned previously, how to choose the LJ parameters can raise other problems. Furthermore, we have shown that unless one uses sufficiently high cut-offs, there could be convergence errors, which mainly occur for nearly head-on collisions. Therefore, all these uncertainties are related to the short-range interactions.
When one calculates contributions from nearly head-on collisions, the short-range potential is the major, or even the dominant, component of the total potential. Thus, it must introduce uncertainties into the nearly head-on collision processes as well. On the other hand, glancing collisions are not sensitive to the short-range potential. Based on these discussions it is obvious that the uncertainties due to the potential model mainly occur for the nearly head-on collisions and become less for the more glancing collisions.
We now consider the uncertainties associated with the collision trajectories. As mentioned in Section 2.3, at present all the half-width calculations for the H2O-N2 pair are based on the assumption that the trajectories are governed by the isotropic potential only. In addition, one also assumes that the translational motion can be treated classically. Except for these two assumptions, the 'exact' trajectory model is not based on any others. Thus, it is the most accurate and physically consistent model among the trajectory models. Therefore, we prefer this model. As shown in Figure 1, the 'exact' trajectories deviate dramatically as $r_c$ varies slightly for nearly head-on collisions, but they would change very little for glancing collisions. This pattern indicates that when one depicts trajectories for the nearly head-on collisions from the isotropic interaction, the small uncertainties of the latter could be significantly enhanced.
In addition, there are more profound effects resulting from the basic assumption that the anisotropic interaction does not play a role in determining the trajectories. The drawback of this assumption is not only that the couplings between the translational and internal motions are completely ignored, but also that the trajectories derived from the potential are less accurate. At present, we are not able to estimate the effects from ignoring the couplings, but we know how to roughly judge the effects from unsuitably depicting the trajectories. Given the fact that the whole interaction, as well as its anisotropic component, shows strong effects for nearly head-on collisions and has little effect on glancing collisions, one can conclude that large uncertainties in depicting the trajectory exist and these uncertainties would inevitably affect the calculated contributions from nearly head-on collisions. There is recent work that considers the effects on the trajectories of an anisotropic interaction when calculating the self-broadening of N2 Raman spectra [21], and further extension to H2O-N2 is needed.
At this stage, we would like to point out that there are other uncertainties associated with the current RB formalism. One is that contributions from $S_3$, the third-order expansion of the $\hat{S}$ matrix, are not included in the expression for the half-width. Recently, there have been studies to derive formulas for the contributions from $S_3$, and even from $S_4$ [22]; however, these formulas are only applicable for cases where the potential model does not contain site-site interactions [23]. Further work on this subject is required, and even after theoretical formulas become available, carrying out practical calculations will remain a big challenge. (We expect that the coordinate representation could be a helpful tool in solving this problem.) Therefore, at present, it is not certain whether the results calculated using only the first two terms $S_1$ and $S_2$ are converged or not. Another problem is associated with the assumption that, with respect to states of the absorber molecule, the resolvent operator appearing in the expression for the spectral density $F(\omega)$ is diagonal. This implies that line-couplings are not taken into account in the calculations. Due to the importance of certain transitions (Q branches, band heads, etc.), there have been many recent studies on line-coupling for other molecules. There has been some progress for the H2O molecule, but more work on this subject is also required.
From the discussion presented above, one can conclude that large uncertainties are always associated with the nearly head-on collisions. Thus, with respect to contributions to the calculated half-widths of the line of interest, the determination of the contribution from nearly head-on collisions is a good measure of the uncertainties. On the other hand, we have already shown in Figure 11 that the largest contributions occur for lines in category 3 and the smallest for lines in category 1. Therefore, one can claim that the division of lines into different categories according to their achievable accuracies coincides with the division according to the magnitudes of their half-width values introduced previously. More specifically, theoretically calculated half-widths can achieve the highest accuracy for lines with large half-widths and they could be associated with the worst uncertainties for lines with small half-widths. Lines with moderate half-width values would contain moderate errors.
There are no sharp half-width boundaries to distinguish these three categories, but we would like to offer a suggestion. For example, according to the air-broadened half-width values listed in HITRAN 2006, the three categories can be divided into > 0.075, between 0.075 and 0.045, and < 0.045 (in units of cm⁻¹ atm⁻¹). With this division, of the 1639 lines in the pure rotational band of H2O, there are 268 lines in category 1, 661 lines in category 2, and 710 lines in category 3. Similarly, there are also no sharp uncertainty boundaries. However, we can provide uncertainty estimations for the accuracy of the theoretically calculated N2-broadened half-widths of the H2O lines that one can conservatively achieve. In contrast to laboratory data, where the measured half-widths of weak lines are more likely to have larger uncertainties than stronger lines, this is not necessarily true for theoretical calculations. Therefore, the uncertainty estimations are applicable for both weak and strong lines in the same category. We believe that, for lines in category 1, one is able to attain 5% accuracy, which is close to the 3% set by HITRAN for strong lines and better than the 10% set for weak lines. In contrast, for lines in category 3 it is impossible to meet the HITRAN requirements, even the 10% level. We believe that the estimations for lines in category 3 should be much worse. (Our most optimistic guess is at least 20-30%.) For lines in category 2, our estimation of the uncertainty is between these two extremes.
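The suggested division can be written as a small classifier. This is a sketch only: the thresholds and the uncertainty statements are those quoted above, while the function name and the handling of values exactly on a boundary (assigned here to category 2) are assumptions, since the text stresses that the boundaries are not sharp.

```python
# Suggested boundaries on the air-broadened half-width, in cm^-1 atm^-1.
UPPER_BOUNDARY = 0.075
LOWER_BOUNDARY = 0.045

# Uncertainty statements as given in the text; category 2 has no single number.
ESTIMATED_UNCERTAINTY = {
    1: "about 5%",
    2: "between the category 1 and category 3 estimates",
    3: "at least 20-30%",
}


def category(gamma_air: float) -> int:
    """Return 1 (large), 2 (intermediate), or 3 (small half-width)."""
    if gamma_air > UPPER_BOUNDARY:
        return 1
    if gamma_air >= LOWER_BOUNDARY:  # boundary values go to category 2 (assumption)
        return 2
    return 3


# Hypothetical half-widths, for illustration only.
for gamma in (0.095, 0.060, 0.020):
    c = category(gamma)
    print(f"{gamma:.3f} cm^-1 atm^-1 -> category {c}: {ESTIMATED_UNCERTAINTY[c]}")
```

Applied to a line list, a function like this makes it straightforward to attach the corresponding uncertainty estimate to each calculated half-width.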
The above estimates provide the theoretical uncertainty that one is able to achieve with the current modified RB formalism. We note that the theoretical uncertainty here means that one is certain that the results presented rest on a sound physical basis and that the relative error from the true value is seldom beyond this limit. It is worth mentioning that, when we claim that we do not believe that the 10% accuracy is achievable for lines in category 3, that does not mean that values of the half-width of all these lines appearing in the literature contain at least 10% errors. In fact, some theoretical predictions may match the true values with smaller errors; rather, we expect that many lines in this category contain errors larger than 10%. The key point here is that, due to the distortions occurring in the calculations, the trustworthiness built on sound physics could be lost. For example, no matter how good the calculated results look, if the values are not converged, they should not be accepted as real representatives of the physical properties.
There is an urgent need to provide accurate theoretical results for line widths and shifts for many practical applications. It appears that, at present, one still needs to rely on the RB formalism to perform calculations, because close-coupling calculations are not feasible for the H2O-N2 pair. However, we believe that a responsible way to address the problem is to provide uncertainty estimations whenever one provides theoretically predicted values.
Finally, we would like to make two suggestions. First, when one updates the half-width values in HITRAN using all sources of measurements and theoretical calculations, it would be useful to distinguish the lines according to their half-width values. It would be prudent, in addition to laboratory error considerations, to favour the measurements over the theoretical calculations for lines in category 3, and vice versa for lines in category 1. Secondly, when experimentalists determine their measurement priorities based on all their laboratory considerations, it would be prudent to consider one more: due to the weakness associated with the theoretical calculations for lines in category 3, measurements enjoy an extra reliability advantage.