Exhaustive State-to-State Cross Sections for Reactive Molecular Collisions from Importance Sampling Simulation and a Neural Network Representation

High-temperature, reactive gas ﬂow is inherently non-equilibrium in terms of energy and state population distributions. Modeling such conditions is challenging even for the smallest molecular systems due to the extremely large number of accessible states and transitions between them. Here, neural networks (NNs) trained on explicitly simulated data are constructed and shown to provide quantitatively realistic descriptions which can be used in mesoscale simulation approaches such as direct simulation Monte Carlo (DSMC) to model gas ﬂow at the hypersonic regime. As an example, the state-to-state cross sections for the N( 4 S)+NO( 2 Π) → O( 3 P)+N 2 (X 1 Σ + g ) are computed from quasiclassical trajectory (QCT) simulations. By training NNs on a sparsely sampled noisy set of state-to-state cross sections it is demonstrated that independently generated reference data is predicted with high accuracy. State-speciﬁc and total reaction rates as a function of temperature from the NN are in quantitative agreement with explicit QCT simulations and conﬁrm earlier simulations and the ﬁnal state distributions of the vibrational and rotational energies agree as well. Thus, NNs trained on physical reference data can provide a viable alternative to computationally demanding explicit evaluation of the microscopic information at run time. This will considerably advance the ability to realistically model non-equilibrium ensembles for network-based simulations.

the smallest molecular systems due to the extremely large number of accessible states and transitions between them. Here, neural networks (NNs) trained on explicitly simulated data are constructed and shown to provide quantitatively realistic descriptions which can be used in mesoscale simulation approaches such as direct simulation Monte Carlo (DSMC) to model gas flow at the hypersonic regime. As an example, the state-to-state cross sections for the N( 4 S)+NO( 2 Π) → O( 3 P)+N 2 (X 1 Σ + g ) are computed from quasiclassical trajectory (QCT) simulations. By training NNs on a sparsely sampled noisy set of state-to-state cross sections it is demonstrated that independently generated reference data is predicted with high accuracy. State-specific and total reaction rates as a function of temperature from the NN are in quantitative agreement with explicit QCT simulations and confirm earlier simulations and the final state distributions of the vibrational and rotational energies agree as well. Thus, NNs trained on physical reference data can provide a viable alternative to computationally demanding explicit evaluation of the microscopic information at run time. This will considerably advance the ability to realistically model non-equilibrium ensembles for network-based simulations. a) m.meuwly@unibas.ch There are numerous situations in physical chemistry which involve a potentially large number of states and transitions between them. Examples include complete line lists for polyatomic molecules in hot environments (e.g. HITRAN 1 ) or state-to-state cross sections in reactive and non-reactive molecular collisions. Exhaustively probing and enumerating all relevant combinations or creating high-dimensional analytical representations is usually impossible. On the other hand, it has been shown for spectroscopic applications that omission of certain crucial states makes prediction or modeling of the spectroscopic band difficult or even impossible. 2 Another example is hypersonic flow around a space vehicle reentering the atmosphere. Temperatures can easily reach 20000 K where reliable experimental data is sparse and the energy distributions are out of equilibrium. In such an environment, the space vehicle is exposed to a collisionally dense environment which generates an immensely diverse population of rovibrational states between which collisions take place. 3 An accurate, fast, and reliable method is required to include this information in more coarse grained models to study the associated dynamics. The question thus is, how to best probe and represent a function with > 10 7 values for discrete input data, without explicitly computing each entry which may require thousands to millions of samples to statistically converge each of the entries.
The present work attempts to develop such a model for state-to-state cross sections σ v,j→v j (E t ) between initial (v, j) and final (v , j ) ro-vibrational states at given relative translational energy E t . For this, the N( 4 S)+NO( 2 Π)(v, j) → O( 3 P)+N 2 (X 1 Σ + g )(v , j ) reaction is considered because a) it is relevant in the hypersonic flight regime characterized by temperatures T ≤ 20000 K at which a multitude of the available ro-vibrational states are occupied and accessible, and b) an accurate, fully-dimensional and reactive potential energy surface (PES) is available. 4 Specifically, N 2 −formation rates from simulations on the 3 A and 3 A PESs are in favourable agreement with experiments for temperatures T ≥ 2000 K and N 2 -formation below 5000 K is dominated by processes on the 3 A surface. 4 State-tostate cross sections are typically required when modeling the reactive flow around a re-entry object with macroscopic dimensions using techniques such as direct simulation Monte Carlo (DSMC) 5 as the flows are usually not in thermal equilibrium. However, it should be noted that the present ansatz will be applicable to other relevant simulations and hypersonic flow is merely chosen because the occupation number of the available states is high and the number of states is large as well.
Scattering cross sections can be determined from quasiclassical trajectory (QCT) simulations. [6][7][8] Here, Hamilton's equations of motion are solved using a fourth order Runge-Kutta numer- To converge each of the σ v,j→v j (E t ) for given E t with a statistical significance of ∼ 10 % an estimated ≥ 10 13 trajectories would be required (10 7 trajectories to converge one cross section, see below; ∼ 10 4 initial states; ∼ 10 4 final states per initial state). Hence, using QCT simulations to directly sample all possible ro-vibrational initial states is computationally impractical, even for this simple 3-body system which also ignores the complexity of electronic states. 11 Thus, alternative approaches need to be explored.
To better define the QCT sampling problem, state-to-state cross sections for σ v=6,j=30→v ,j (E t = 2.5 eV) were considered. A total of 2 × 10 7 trajectories was run initially, which is considered as the reference. Out of the 3784 energetically accessible states, 3420 final states are found as products, i.e. 90.4 %. Compared to this, running fewer trajectories (8 × 10 4 , 1.6 × 10 5 , 1.6 × 10 6 and 9.6 × 10 6 ) finds 37 %, 54 %, 83 % and 90% of the final states, respectively, see Figure S1, which converges at approximately the cube root of the number of trajectories.
In addition, running too few trajectories leads to highly oscillatory cross sections due to the limited statistics of the final state.
Some computational gain can be obtained from exploring the fact that cross sections often vary smoothly with v and j. This allows to locally average computed cross sections according (1) where n v and n j are the bin widths for vibration and rotation over which the state to state cross sections are averaged. Figures 1A and B report the raw data from 1.6 × 10 6 trajectories and locally averaged cross sections from the same data set with n v = 2 and n j = 3, respectively. Comparison with the unaveraged result from N tot = 2 × 10 7 trajectories in Figure 1D illustrate the benefit of local averaging, see also Figure  Concerning the sampling strategy for choosing initial conditions it is worthwhile to note that large impact parameters b mostly lead to nonreactive collisions. A straightforward Figure S5) and the number of trajectories sampled in the interval b + db increases with increasing b, which leads to a larger fraction of nonreactive trajectories when b increases. For such situations, importance sampling (IS) 12 can provide a more advantageous protocol as those values of b for which reactive trajectories are more likely to occur are chosen with higher probability, which causes the cross section of all of the exit channels to converge at the same rate.
In order to determine the necessary weighting function w(b) for a particular trajectory the following strategy is used. For given (v, j, E t ) first a few thousand trajectories are run by Figure S5). The number of reactive trajectories as a function of b is fitted to From this, the distribution of the cross sections, g(b) = 2πbn(b) is determined. Next, 1.6 × 10 5 initial conditions are sampled from g(b) (see Figure S5), the trajectories are explicitly run and the weight of each trajectory is calculated according to The performance of IS is illustrated in Figures 1C and S4C which report the locally averaged cross sections from 1. found using IS from N tot = 1.6 × 10 5 compared with 44737 and 44979 reactive trajectories for N tot = 1.6 × 10 6 from conventional sampling, respectively. This leads to an efficiency increase by one order of magnitude when IS is used. The effect of IS on the convergence of state-to-state cross section can also be seen in Figure S2 and S3.
Overall, IS and local averaging lead to an estimated reduction of the required number of QCT trajectories by a factor of ∼ 60 which will be explored next to cover the entire state space for state-to-state cross sections for different selected initial states. Those will then be used for training a NN to construct a model to compute state-to-state cross sections.
To compute the necessary reference data to train the NN, 10 initial v−states (v = 0, 3, 6, 9, Thus, for a total of 1232 initial conditions in the (v, j, E t ) space, for each of them 1.6 × 10 5 QCT trajectories were run with IS of b. The cross sections from these ∼ 10 8 trajectories constitute the training set for learning the NN to predict σ v,j→v j (E t ) for any initial (v, j, E t ).
The network architecture used here is inspired by ResNet. 13 The NN transforms its input through four identical residual blocks 13 , after which a linear transformation followed by a scaled sigmoid function is used to obtain the final output, see Figure S6. Before they can be used as input to the residual blocks, the raw input features f ∈ R N in are first transformed by a linear transformation x = Wf + b to a vector x ∈ R F with the same dimensionality F as the hidden features in the residual blocks in order to simplify the formulation of skip connections 13 . Here, W ∈ R N in ×F (weight matrix) and b ∈ R F (bias vector) are parameters to be optimized. The residual blocks consist of two dense layers with the same number of nodes F (see Figure S6) and transform their input x l according to where the superscript l denotes parameters or feature representations of layer l, and W ∈ R F ×F and b ∈ R F are parameters. Two different activation functions, one for rectified linear units (ReLU) 14 and a self-normalizing 15 inverse hyperbolic sine (snasinh) 16 are used in the residual blocks. Use of residual blocks improves the flow of gradients between layers 13,17 and helps alleviate the "vanishing gradients problem". 18 After the last residual block, the final output is obtained from The entire reference state-to-state data set (∼ 10 7 ) has an average cross section of 0.00246 To test the quality of model NN-STS, 2 × 10 5 state-to-state cross sections were randomly chosen from the test data. The RMSE is 0.000328 a 2 0 for those data with a maximum deviation of 0.008 a 2 0 (for σ QCT = 0.211 a 2 0 , σ NN−STS = 0.219 a 2 0 ) and the correlation coefficient is R 2 = 0.993, see Figure S8. A similar analysis was carried out for model NN-Tot which has an average cross section of 16.18 a 2 0 for all 1232 data points. The RMSE is found to be 0.1552 a 2 0 and the correlation between NN-Tot and QCT is R 2 = 0.9997, see Figure S9, indicative of a high quality of the fit.  As another test, total thermal rates k(T ) were calculated from QCT simulations and compared with results obtained from NN the models, see Figure 5   In summary, the state-to-state cross sections for a reactive collision relevant to the hypersonic flight regime has been successfully modelled using a neural network based on explicitly calculated data from QCT simulations on an accurate, fully dimensional reactive PES. Such an approach is general and demonstrates that for situations in which large amounts of data constitute the relevant state space, subsampling and subsequent machine learning can provide a viable, accurate and computationally tractable alternative to exhaustive compu-tations.
work was supported by the United State Department of the Air Force which is gratefully acknowledged (to MM).

DATASET AND CODE AVAILABILITY
The code for training the NN-model is freely available from https://github.com/ MeuwlyGroup/NNcross together with input data and a sample training set.