Validation and Improvement of NeQuick Topside Ionospheric Formulation Using COSMIC/FORMOSAT‐3 Data

We examine systematic differences between topside electron density measurements and different topside model formulations including α‐Chapman, NeQuick, and an improved NeQuick (NeQuick‐corr) topside formulation, recently proposed. The global topside electron density data set used, was extracted from Radio Occultation (RO) topside electron density profiles on board Low‐Earth Orbit satellites from the COSMIC/FORMOSAT‐3 (Constellation Observing System for Meteorology, Ionosphere, and Climate and Formosa Satellite) mission. By using RO topside electron density measurements collocated with digisonde stations, we ensure that our investigation is based on two independent data sets (RO and digisondes). A subset of these profiles, with matched (within 5%) peak RO‐digisonde characteristics (foF2 and hmF2) is also exploited. This subset is exploited to extend the investigation on the basis of a higher quality validation data set. The comparison demonstrates that α‐Chapman and NeQuick‐corr underestimate, whereas NeQuick overestimates COSMIC topside electron density observations. The main outcome of this study is the significant NeQuick topside representation improvement that can be achieved near the F region peak, if the key parameter g, which controls the change of scale height with respect to altitude, is optimized to a value of 0.15 (compared to a currently adopted value of 0.125). The NeQuick‐corr topside formulation using the optimized g value of 0.15 outperforms all other topside formulations.

NeQuick topside formulation has also been adopted as one of three options to model the electron density in the topside ionosphere in the frames of the International Ionosphere Model (IRI)-2016 (Bilitza et al., 2017). The other two options are IRI-2001(Bilitza, 1990) and IRI01-corr (Bilitza, 2004). NeQuick is considered as the most reliable of the three options (Buresova et al., 2009;Strangeways et al., 2009) but according to past and recent studies there is still room for improvement (Bilitza, 2009;Bilitza et al., 2006;Pignalberi et al., 2016). The NeQuick topside model is based on an Epstein function (Nava et al., 2008) (as shown in Equation 1). The electron density profile (Ne (h)) is defined as a function of hmF2, NmF2 and effective scale height (Hm) as follows: The scale height in the NeQuick topside formulation is described by three parameters, scale height at the peak (H 0 ), a parameter (r) which restricts the scale height at higher altitudes and the altitude gradient of the scale height (g). A value of r = 100 and g = 0.125 is adopted in NeQuick topside formulation, while H 0 can be estimated from Equation 3, where foF2 is the peak critical frequency, NmF2 is the peak electron density, hmF2 is the height corresponding to NmF2, and R12 is the 12 months smoothed sunspot number: An improvement in the NeQuick topside formulation (NeQuick-corr, Pezzopane & Pignalberi, 2019) was recently proposed. This NeQuick-corr topside formulation is based on H 0 grids as a function of hmF2 and foF2, derived from electron density values measured by Langmuir probes on-board Swarm satellites (A, B, and C), generated by applying the IRI-UP (Update) method (Pignalberi et al., 2018). According to NeQuickcorr, H 0 is estimated using H 0, AC and H 0, B at two different altitudes for each pair of hmF2 and foF2 values, in accordance to Equations 5 and 6:

Data
The topside COSMIC RO electron density values used as a COMPARISON data set and the full topside electron density profiles subset used as a VALIDATION data set in this study. These data sets were extracted from RO electron density profiles downloaded from the COSMIC data analysis and archive center (CDAAC) (https://cdaac-www.cosmic.ucar.edu/cdaac/products.html). Both data sets were selected under a maximum separation requirement (<1° in latitude and longitude) from the corresponding digisonde station location, in order to control the quality of the assembled RO data set, based on a minimum colocation separation. Figure 1 shows a COSMIC RO profile ground projection with respect to latitude and longitude, where the red part of the profile identifies the bottomside and the blue part the topside profile projection on the ground, respectively. It also shows the nearest digisonde station (Nicosia station in this example) and the minimum (perpendicular) distance between digisonde station and topside profile projection. By ensuring a maximum separation requirement (<1° in latitude and longitude) between this RO topside measurement and digisonde location we could achieve sufficient colocation criteria between these two data sets, and form a COMPARISON RO data set that we have subsequently used in our study. Unrealistic RO profiles with excessive fluctuations in the topside electron density or with hmF2 values out of a realistic range [150 < hmF2 < 450 km] were discarded. In total, 29,063 topside electron density values corresponding to a period of 2006-2018, encapsulating most of the COSMIC mission time span, were extracted from these profiles to assemble the so-called COMPARISON data set. The autoscaled digisonde hmF2 and NmF2 data corresponding to the digisonde profile measurements within a time interval of less than ±15 min from the RO measurements were downloaded from the Digital Ionogram Data Base (DIDBase-http://giro.uml.edu/ didbase/scaled.php). To extend the investigation we have also focused on a more reliable VALIDATION full topside profile data set that was assembled as a subset of the COMPARISON data set under the additional requirement of maximum difference at the peak values (<5% difference in hmF2 and foF2), in accordance to a previous study by Shaikh et al. (2018). This, VALIDATION, was composed of 3,433 out of the 29,063 COMPARISON data set cases.
The selected digisonde stations, their location (latitude and longitude) and the number of nearest selected COSMIC topside values, and profiles forming the COMPARISON and VALIDATION data sets, are shown in Table 1. Figures 2a and 2b show hmF2 and foF2 scatter plots, extracted from the corresponding digisonde and COSMIC electron density profiles to form the topside COMPARISON data set. Both plots indicate a significant correlation between the two data sets in both hmF2 and foF2 with correlation coefficients of 0.68 and 0.82 respectively. The scatter plots for hmF2 and foF2 recorded from digisonde and extracted from RO electron density profiles considered in this more reliable topside VALIDATION subset are plotted in Figures 3a and 3b, clearly demonstrating, very high correlation coefficients at 0.98 and 0.99, respectively.
The α-Chapman topside electron density values were extracted from profiles that were calculated in terms of autoscaled hmF2, foF2 and scale height as shown below: To compare the full topside profiles recorded by COSMIC RO satellites and modeled by α-Chapman, NeQuick, NeQuick-corr, and New g _NeQuick-corr a relative difference (as a function of altitude beyond the peak) was calculated as shown below: with X = Chap, NeQ, NeQcorr, gNeQcorr for Y = α-Chapman, NeQuick, NeQuick-corr, and New g _NeQuickcorr, respectively and, where htop denotes the peak-relative altitude in km.
To investigate the overall performance in terms of the full profile in the various topside formulations, a Normalized Root Mean Square Error (NRMSE) was calculated for each of the 3,433 profiles of the VALI-DATION subset, using: SINGH ET AL.
where subscript RO refers to COSMIC measurements, while modeled to either α-Chapman, NeQuick, NeQuick-corr or New g _NeQuick-corr. N is the total number of electron density profile points.
The scale height (Hm) was calculated for COSMIC, α-Chapman, NeQuick and NeQuick-corr from the Epstein equation. in accordance to Pignalberi, Pezzopane, Nava, et al. (2020) and Pignalberi, Pezzopane, Themens, et al. (2020), as shown below: so that the equation becomes: By using the Sridhar Acharya formula, the solution for the above quadratic equation reduces to: SINGH ET AL.
10.1029/2020JA028720 6 of 18 and by solving Equations 13 and 17, Hm would be:

Results and Discussion
The comparison between topside electron density profile measurements and model formulations, as described in Section 2 is presented in the following sections. The results in section ( Figure 4a shows the binned scatter plot between peak-relative altitude (htop = h-hmF2) and relative difference (RD RO_Chap ) between RO observations and α-Chapman estimations, while the color bar shows the counts in each bin. As it can be seen from the graph, in the vast majority of cases RD RO_Chap is greater than zero which indicates that α-Chapman underestimates RO observations and this difference increases with htop with the bin occurrence maximizing around 500 km (above hmF2). The evidence from Figure 4a is justified because digisonde topside estimation is based on a α-Chapman function, with a constant scale height (as shown in Figure 4b), but real observations differ from α-Chapman estimates because scale height increases linearly with height over the peak (Olivares-Pulido et al., 2016). Figure 5a shows the binned scatter plot between peak-relative altitude (htop) and relative difference (RD RO_NeQ ) between RO observations and NeQuick estimates. It shows that NeQuick slightly overestimates SINGH ET AL.

Analysis Based on the COMPARISON Data Set
10.1029/2020JA028720 8 of 18  RO observations up to an approximate htop altitude of 300 km and then its behavior reverses underestimating RO measurements. NeQuick is based on an Epstein function to represent the topside profile with an approximately linear scale height (calculated using Equation 18, as shown in Figure 5b) and therefore its performance is superior to α-Chapman. NeQuick considers values of r = 100 and g = 0.125 for calculating the scale height. The error with respect to htop as shown in Figure 5a) could be attributed to the difference in the change of scale height with respect to htop (represented by g) between RO observations and NeQuick estimations (Themens et al., 2018). Figure 6a shows the binned scatter plot between peak-relative altitude (htop) and relative difference (RD RO_ NeQcorr ) between RO observations and NeQuick-corr estimates. It shows that NeQuick-corr underestimates RO observations and this underestimation increases with htop. The NeQuick-corr is equivalent to NeQuick but the value of H 0 is deduced from H 0,AC and H 0,B grids following Equations 5 and 6 as proposed by Pezzopane and Pignalberi (2019) and the scale height (as shown in Figure 6b) is calculated by Equation 18. As it is clear from Figures 5a and 6a, for the majority of cases NeQuick exhibits an approximate relative difference error in the range −0.2 to 0.4 and for NeQuick-corr the error lies in the range of 0-0.35, which demonstrates that NeQuick-corr outperforms NeQuick. The scale height behavior of RO observations is shown in Figure 7 as calculated from the COSMIC measurements in accordance to Equation 18. A careful inspection of this plot and its comparison with Figures 5b and 6b reveals that there is seems to be a notable difference in the slope for the RO data set with a higher value of g necessary to match this increased slope according to Equation 7.
The above results clearly indicate that the scale height calculated using different H 0 formulations is not able to match the scale height calculated from RO observations and that further potential improvement could be achieved by more appropriate values for r and g as indicated by other past studies (Themens et al., 2018). To explore this possibility, we used least squares to optimize the value of g and r, keeping H 0 constant, using NeQuick-corr which exhibits the lower relative dif-SINGH ET AL.
10.1029/2020JA028720 9 of 18  ference with respect to the RO data set (RD RO_NeQcorr ). The value of r was varied with a step size of one and g with a step size of 0.01. As the RO data were mostly limited to an altitude below 800 km, since r controls the scale height at higher altitudes, the value of r remained constant during this optimization (r = 100). Pignalberi, Pezzopane, Nava, et al. (2020) and Pignalberi, Pezzopane, Themens, et al. (2020) demonstrated that the effect of r on the scale height becomes significant for altitudes much higher than the F2 peak. Figure 8 shows the variation of r and g with respect to the RMSE calculated between RO observations and NeQuick-corr estimates. RO and NeQuick-corr comparison showed that for r = 100 and an optimized value of g = 0.15, RMSE minimizes. To evaluate the relative difference performance with this new optimized value of g, the Epstein equation with r = 100 and g = 0.15 was used to estimate the electron density (New g _NeQuickcorr) with a scale height calculated according to NeQuick-corr using H 0 extracted from the H 0,AC and H 0,B grid. Figure 9a shows the binned scatter plot between peak-relative altitude and the relative difference (RD RO_gNeQcorr ) between RO observations and New g _NeQuick-corr estimates. It shows that the (RD RO_gNeQcorr ) is almost constant with htop and it is confined within a range −01 to 0.2 which is lower than the other cases shown (Figures 4a, 5a, and 6a). Therefore, it can be stated that the performance of New g _NeQuick-corr method is better than the other three topside modeling approaches for this particular COSMIC RO data set. The scale height (calculated from Equation 18 for New g _NeQuick-corr method is shown in Figure 9b.

Analysis Based on the More Reliable VALIDATION Data Set
As already mentioned, this more reliable VALIDATION data set was actually a subset of the COMPARI-SON data set with an additional requirement for a maximum difference in hmF2 and foF2 (within <5%) at the profile peak. We have to stress though that the VALIDATION data set comprises of full topside electron density profiles whereas the COMPARISON data set analyzed in the previous sections comprises electron density values at various topside altitudes corresponding to the nearest distance to a digisonde station. Figures 10aand 10b show the binned scatter plot between peak-relative altitude (htop = h-hmF2) SINGH ET AL.
10.1029/2020JA028720 10 of 18  . The graph shows the binscatter plot of (a) relative difference (RD RO_gNeQcorr ) between COSMIC RO observations and New g _NeQuick-corr estimations (b) Scale height of New g _NeQuick-corr estimations as a function of peak-relative altitude (h-hmF2). and relative difference (RD RO_Chap (h)) between RO and α-Chapman profiles, for h-hmF2 > 100 and h-hmF2 < 100, respectively. The color bar represents the counts in each bin. As discussed in Section 3.1, α-Chapman underestimates RO observations and it increases with htop, which can also be observed from Figure 10a as RD RO_Chap (h) increases with htop. Figure 10b shows that up to 100 km over hmF2, the average RD RO_Chap (h) fluctuates around zero. This is expected as α-Chapman scale height is constant, around the peak.
Figures 11a and11b show binned scatter plots between peak-relative altitude (htop) and relative difference (RD RO_NeQ (h)) between RO profile and NeQuick estimated profile, for h-hmF2 > 100 and h-hmF2 < 100, respectively. Figure 11a shows that NeQuick overestimates (−0.5 to 0 for the majority of profiles) RO up to approximately htop = 300 km and then its behavior reverses with a definite underestimation (within 0-0.2 for most profiles). The results are similar with the findings discussed in Section 3.1 indicating that NeQuick SINGH ET AL.
clearly outperforms α-Chapman. Figure 11b shows that up to htop = 100 km, the average RD RO_NeQ (h) fluctuates around 0, which suggests that NeQuick also exhibits approximately constant scale height around the peak.
Figures 12a and12b shows the binned scatter plot between peak-relative altitude (htop) and relative difference (RD RO_NeQcorr (h)) between RO and NeQuick-corr, for h-hmF2 > 100 km and h-hmF2 < 100 km, respectively. Figure 12a shows that NeQuick-corr underestimates RO and RD RO_NeQcorr (h) increases (0-0.5) with htop. Unlike NeQuick, the behavior of NeQuick-corr does not reverse with htop, whereas the RD RO_NeQcorr (h) gets saturated for htop > 300 km. Figure 12b shows that up to htop = 100 km average RD RO_NeQcorr (h) fluctuates around zero suggesting that similar to α-Chapman and NeQuick, NeQuick-corr also exhibits nearly constant scale height around the peak.
NRMSE between RO and the three topside formulations was also calculated (using Equation 12). Figure 13a shows the scatter plot between the NRMSE values for NeQuick-corr (with respect to COSMIC) on x axis and NRMSE values for α-Chapman (with respect to COSMIC) on y axis. For the majority of cases NRMSE_α-Chapman exceeds NRMSE_NeQuick-corr, which means NeQuick-corr performs better than α-Chapman. Figure 13b shows the scatter plot between NRMSE_NeQuick-corr (with respect to RO) on x axis and NRMSE_NeQuick (with respect to RO) on y axis for each individual matched peak profile. It shows that NRMSE_NeQuick-corr is lower for nearly half the cases (1,803 out of 3,433) and NRMSE_NeQuick is lower for the rest (1,640 out of 3,433) but for the majority NRMSE_NeQuick-corr is more bounded (from 0 to 0.5) whereas NRMSE_NeQuick extends from 0 up to 0.8. Therefore, we can conclude that NeQuick-corr is superior to NeQuick for representing the topside ionosphere, based on this particular RO data set under consideration. Klipp et al. (2020) recently applied the NeQuick-corr method to study the comparison between the ionospheric total electron content from ionosondes and the International GNSS service vertical total electron content and reported that the error was reduced by 27%.
The optimized value of g = 0.15 calculated in Section 3.1 (with a value of r = 100) was also tested using the VALIDATION data set. Figures 14aand14b show the binned scatter plot between peak-relative altitude (htop) and relative difference (RD RO _ gNeQcorr (h)) between COSMIC and New g _NeQuick, for h-hmF2 > 100 and for h-hmF2 < 100. Figure 14a clearly shows that RD RO _ gNeQcorr (h) is almost constant with respect to htop and that it is confined within a region (−0.2 to 0.2). RD RO _ gNeQcorr (h) is also almost 0 for h-hmF2 < 100, as shown in Figure 14b. By comparing Figures 10-12 and 14, it is clear that New g _NeQuick (which is basically NeQuick-corr with a value of g = 0.15) outperforms all other topside formulations for the high-quality VALIDATION data set as well.
SINGH ET AL.

Topside Scale Height Linear Variation and Validation of Optimized Value of g = 0.15 Using VALIDATION Data
As discussed in Sections 3.1 and 3.2, the behavior of the topside scale height is expected to be linear. So, to verify this for all VALIDATION RO profiles (3,433 profiles) the scale height was calculated using Equation 18. The scale height of each profile was fitted under a linear approximation as shown in Figure 15a and subsequently the corresponding electron density profile was calculated, based on the estimated g value providing the best fit. Figure 15b shows the relative difference between measured and modeled electron density (using linearly fitted scale height) with the error within 5% for most of the profile points. This verifies the SINGH ET AL.
10.1029/2020JA028720 13 of 18  The graph shows the binscatter plot of relative difference (RD RO_gNeQcorr (h)) between COSMIC RO observed and New g _NeQuick-corr estimated matched peak electron density profiles for (a) h-hmF2 > 100 (b) h-hmF2 < 100 as a function of peak-relative altitude (h-hmF2).
linear scale height variation hypothesis, up to 500 km over hmF2 (Pignalberi, Pezzopane, Nava, et al., 2020;Pignalberi, Pezzopane, Themens, et al., 2020). Figure 16 shows the variation of g (calculated from Equation 7) with respect to RMSE between RO and linearly fitted scale height electron density profiles using the VALIDATION data set. It shows that for the majority of the profiles, a value of g = 0.15 (±0.015) minimizes RMSE. As it was discussed in Sections 3.1 and 3.2, for an optimum value of g = 0.15, the relative difference between RO and NeQuick-corr minimizes and exhibits the best performance among all four topside formulations tested on both COMPARISON and VALIDATION data sets as demonstrated by RD RO_gNeQcorr (Figure 9a) and RD RO _ gNeQcorr (h) (Figure 14a), respectively, but also in the histograms shown in Figures 17a and 17b. Themens et al. (2018) based on different data sets discussed that the NeQuick option can be improved over upper midlatitude and high latitude regions by adjusting r and g values to r = 20 and g = 0.2024. Another study by Themens et al., 2014 showed that NeQuick parameterization does not adequately represent the topside thickness during solar minimum between cycles 23 and 24. Our study has demonstrated a different optimum g value that is more suitable for midlatitudes. As shown in Figure 18 which depicts g-dependent histograms for specific high-latitude stations (a, b, and c) in the northern hemisphere, midlatitude stations (d, e, and f) in the northern hemisphere and midlatitude stations (g, h, and i) in the southern-hemisphere we can verify that g = 0.15 is apparently the optimum value irrespective of the digisonde location. Considering though the results from all 44 stations, we can also observe that there is a trend for increasing optimum g values as we move toward higher latitudes. This is evident from Figure 19 which is in agreement with recent evidence by Pignalberi, Pezzopane, Nava, et al. (2020) and Pignalberi, Pezzopane, Themens, et al. (2020) that the gradient of the modeled topside scale height (which is equivalent to g according to Equation 7) exhibits a distinct global spatial variation pattern with a clear increasing trend as a function of latitude.

Conclusion
This study was primarily based on the comparison of various topside formulation methods (α-Chapman, NeQuick and NeQuick-corr.) with an extended data set of COSMIC RO observations. Through a data set com-SINGH ET AL.
10.1029/2020JA028720 14 of 18  prising of 29,063 COSMIC RO topside electron density measurements in the vicinity of 44 digisonde stations we were able to identify particular weaknesses of these formulations and to solidify the linear scale height variation hypothesis (up to several hundred km over hmF2) as the best appropriate option to describe the topside electron density variation with height. Based on this comparison, we propose that a new value of g = 0.15 (as opposed to a currently adopted value of 0.125) is adopted to ensure the superior NeQuick-corr topside SINGH ET AL.
10.1029/2020JA028720 15 of 18 Figure 18. The graph shows the variation of parameter g with respect to its frequency for different stations, which belongs to different latitudinal range.
representation. To validate this new optimum g value in the frames of the NeQuick-corr topside formulation, a more reliable higher quality RO topside data set of 3,433 profiles with matched (within 5%) peak RO-digisonde foF2 and hmF2 characteristics was used. This enabled us to validate our results and to conclude that despite an optimum global value g = 0.15, there is a clear marked spatial feature according to which the optimum g value increases with increasing latitude. This could be the focus of future studies in order to improve topside electron density representation in operational models. The single-frequency GNSS correction algorithm (NeQuick-G) is an example of such an operational model. Despite the emerging trend for the adoption of dual-frequency chipsets (able to overcome the dispersive ionospheric impact in the positioning error budget) a vast array of future applications (such as Internet of things (IoT) device control) will continue to demand single-frequency ionospheric correction for which topside ionospheric contribution in TEC is significant.

Data Availability Statement
The topside COSMIC RO electron density values used as a COMPARISON dataset and the full topside electron density profiles subset used as a VALIDATION dataset in this study were extracted from RO electron density profiles downloaded from the COSMIC data analysis and archive centre (CDAAC) (https:// cdaac-www.cosmic.ucar.edu/cdaac/products.html). The autoscaled digisonde hmF2 and NmF2 data corresponding to the digisonde profile measurements within a time interval of less than ±15 min from the RO measurements were downloaded from the Digital Ionogram Data Base (DIDBase-http://giro.uml.edu/ didbase/scaled.php). The corresponding NeQuick values were estimated at the corresponding COSMIC topside electron density altitude using the FORTRAN IRI-2016 source code, available at http://irimodel.org/ by ingesting the hmF2 and foF2 autoscaled values. These hmF2 and foF2 values were also used to calculate H0 using the H0,AC and H0,B grid (downloaded from the supplementary data of the Pezzopane and Pignalberi, (2019)) in order to calculate the NeQuick-corr profiles. Figure 19. The graph shows the binscatter plot of variation of parameter g with respect to the latitude.