Validation of atmospheric correction approaches for Sentinel-2 under partly-cloudy conditions in an African agricultural landscape

Globally, remotely sensed agricultural monitoring is impeded by cloudy conditions which render the acquired images useless. In semi-arid landscapes, rainfed croplands dominate agricultural production; thus, the majority of planting occurs in the rainy season, characterized by erratic cloud cover. The clear-sky pixels in partly-cloudy images can be used to increase the number of useful observations for quantitative analysis. This is achieved by cloud screening and atmospheric correction (AC) processes. However, the effectiveness of various AC approaches under partly-cloudy conditions is still unknown. Many studies validate surface reflectance (SR) under clear-sky conditions, with only a few focusing on the validation of SR under partly-cloudy conditions. This study sought to validate Sentinel-2 SR products derived from various AC approaches, i.e., MAJA, Sen2Cor, iCor, and FORCE, using in-situ spectral measurements. A partly-cloudy image with >60% cloud cover, acquired over a semi-arid agricultural landscape and ±3 days of the field measurements, was used as a test case. The results showed R2 of <0.2 with in-situ SR in the VIS and SWIR, and R2 of ~0.6 in the Red-edge and NIR regions, attributable to low aerosol scattering effect in the NIR. However, RMSE, MAE, and BIAS error metrics showed consistently higher errors across all AC approaches and crop types, with varying magnitude per spectral band. This finding can be attributed to the high presence of clouds in the image, which enhanced the apparent path radiance, causing an overestimation of the aerosol optical thickness, hence an underestimation of SR. The need for further validation of SR under various cloud conditions emanates from this study, to ascertain the utility of AC approaches under such conditions. Overall, the results have implications for the utility of remotely sensed images for agricultural crop monitoring in partly-cloudy conditions.


INTRODUCTION
Radiometric accuracy and consistency of optical remotely-sensed satellite data is a key requirement in quantitative vegetation studies [1,2]. Long-term and seasonal studies of vegetation phenology and stress are particularly sensitive to radiometric inconsistencies resulting from variability in sensor performance, inaccurate radiometric calibration, and changing atmospheric conditions. Although the risks of inaccurate and inconsistent radiometric measurements due to sensor performance and radiometric errors can be averted during the sensor design and calibration, the changing atmospheric conditions are inevitable. Therefore, optical sensors such as Sentinel-2 Multi-Spectral Imager (MSI) are susceptible to varying degrees of aerosols, haze, clouds, and cloud shadow contamination, which affect the quality and utility of acquired images. For example, the lower quality optical satellite images impede the accuracy and reliability of derived agricultural information, thus the full potential of satellite data as a source of information for agricultural applications that include supporting crops and land management decisionsand policy formulation.
Moreover, atmospheric conditions such as clouds limit the number of observations for agricultural monitoring applications, rendering the optical satellite images inappropriate as it comes with a larger delay (i.e., >5 days) than required to support timely agricultural management decisions [3]. For example, cloud cover spatial coverage in scenes acquired during the rainy season may exceed 90%, thus entirely obscuring the landscape at a particular period. However, optical satellite images are core to many applications due to their long-term, consistent, and global record. Consequently, clear-sky pixels in partly-cloudy images are used to increase the number of useful observations for quantitative analysis through cloud screening and removal algorithms, and subsequently, gap filling and pixel compositing [4][5][6][7].
To reduce the omission errors for remnant cloud contamination in the Function of mask (Fmask) algorithm, Frantz et al. [8] introduced a two-step cloud and cloud shadow screening method. Further, Frantz et al. [9] presented the Cloud Displacement Index (CDI) to improve the discrimination of built-up areas and clouds and the detection of low altitude clouds in Fmask, which result from the unavailability of the thermal band in Sentinel-2 multi-spectral imager (MSI) images. A recent study [7], demonstrated that a newly developed Fmask algorithm (i.e., version 4.0) performed better than its predecessor (i.e., version 3.3) for Landsats 4 -8, while it was significantly better than Sen2Cor algorithm (version 2.5.5). Wei et al. [10] proposed a Random-Forest-based cloud mask (RFmask) based on a Random Forest (RF) algorithm and super-pixels extracted using segmentation. The authors found that the algorithm could detect sparse and thin clouds over both bright and dark surfaces in Landsat-7 enhanced thematic mapper (ETM+) and Landsat-8 operational land imager (OLI) images at an overall accuracy of >90%. Another study [11], proposed a cloud detection algorithm based on hybrid multi-spectral features (HMF), which combines the multiple features, i.e., Haze-optimized Transformation (HOT), normalized difference vegetation index (NDVI), and WHITENESS, and RF algorithm. Their results, using GF-1 images, showed a mean accuracy of >90% and a reduction in processing time by >80%. Recently, the new approaches for cloud removal, based on deep learning algorithms, have been proposed for various optical sensors [12,13]. While these methods are useful for removing clouds and cloud shadows, the radiometric integrity of the cloudfree pixels in partly cloudy images have been rarely studied. This is indispensable since the atmospheric gases and aerosols affect: (i) the accuracy, integrity, and consistency of optical radiometric measurements, (ii) estimation of biophysical parameters such as leaf area index, (iii) image classification accuracy, and (iv) reliability of downstream applications [14][15][16]. Moreover, aerosols are known to enhance the atmospheric backscattering signal and cause the attenuation of the surface directional reflectance signal [17]. Ohde [18] determined the sensitivities of chlorophyll-a concentration estimation methods to atmospheric dust and clouds. Their results showed that dusty skies caused an overestimation of >8% in chlorophyll-a concentration. In contrast, the atmospheric conditions consisting of mixtures of clouds and dust resulted in overestimations of between 7% to 14%. The atmospheric effects are removed or minimized through atmospheric correction (AC), to improve comparability of radiometric measurements collected over regions with various aerosol loading.
Fortunately, a wide spectrum of AC approaches exists, presenting prospects for improved image quality and better radiometric accuracy and consistency, especially for new generation sensors such as Sentinel-2 MSI, with improved spectral, temporal and spatial resolutions. Furthermore, most of the AC approaches incorporate cloud masking procedures as a critical preprocessing step. The Second Simulation of Satellite Signal in the Solar Spectrum (6S) [20] is one of the popular Radiative Transfer Models (RTMs) for atmospheric correction of various optical data such as data from advanced very high resolution radiometer (AVHRR) [21], moderate resolution imaging spectroradiometer (MODIS) [22], Landsat thematic mapper (TM) [23,24], Advanced Wide Field Sensor (AWIFS) [25] and Landsat-8 operational land imager (OLI) [26,27]. Recent studies [28][29][30][31][32] demonstrate its applicability for Sentinel-2 MSI. Since its launch, various AC approaches for Sentinel-2 MSI have been developed [31,[33][34][35], validated and inter-compared [28][29][30][36][37][38][39][40]. However, studies [28,35,38,39] report inconsistent performance according to different sensors, land covers, spectral bands, and environments. Sen2Cor [33], specifically developed for Sentinel-2 MSI data, has been adopted as the current state-of-art operational approach to provide surface reflectance (SR) products. Its algorithm is based on the ATCOR model [41]. It converts the Level-1C top-of-atmosphere (TOA) image data to SR based on the libRadtran database of look-up tables (LUTs) generated for a wide variety of atmospheric conditions, solar geometries and ground elevations. By comparing the simplified model for atmospheric correction (SMAC), 6S and Sen2cor approaches, Wei et al. [32] found that Sen2Cor was the most accurate SR and NDVI the least accurate. In another study, Sola et al. [28] found that MACCS-ATCOR Joint Algorithm (MAJA)an operational multi-temporal atmospheric correction and cloud screening algorithm based on Multi-Mission Atmospheric Correction and Cloud Screening (MACCS) and atmospheric correction software (ATCOR) developed by French Centre National d'Études Spatiales (CNES) and the German Aerospace Center (DLR)performed better than Sen2Cor and 6S approaches when considering band-wise performance. Further, along with iCoran image-based AC algorithm available as a SNAP plugin previously known as OPERA [42] -MAJA showed better performance in classifying land covers in the Mediterranean environment. Therefore, there is no universal AC approach, hence the need to continuously validate AC approaches in various environments and conditions to ascertain that remote sensing measurements and methods produce self-consistent and accurate biogeophysical and chemical properties [43]. Additionally, validation of AC products is useful to inform further development of the AC algorithms and operational use that supports the implementation and monitoring of food security policies [16,44].
Critically, the effectiveness of various AC approaches under partly-cloudy conditions is unknown. Previous studies validated surface reflectance (SR) derived from various AC approaches under clear-sky conditions [28,38,40], with only a few focusing on the SR derived from partly-cloudy images [45]. Additionally, the performance of AC approaches in semi-arid agricultural areas is still unclear, with most studies rarely focusing on validation of AC approaches across different crop types. For example, previous studies focused on the inter-comparison of AC approaches over inland and coastal waters [35,39], and broad land cover categories [28,38]. Therefore, this study specifically validated Sentinel-2 SR products derived from MAJA, Sen2Cor, iCor, and FORCE using in-situ spectral measurements on a semi-arid agricultural landscape. Moreover, each AC approach evaluated in this study performs cloud masking as part of the AC procedure. This is an essential feature since the Sentinel-2 Level 1-C cloud mask product is unreliable. As a secondary objective, we compared the cloud masks from each approach. The test data for this study consisted of partly-cloudy Sentinel-2 image with ~60% cloud cover, acquired over a semi-arid agricultural landscape, and ±3 days of the field measurements.

Study area
The study area is in the vicinity of Harrismith (28°24´53˝ S and 28°34´25˝ E) in the Free State province, South Africa ( Figure 1). The study area is one of two sites in South Africa used by the Group on Earth Observations -Global Agricultural Monitoring (GEOGLAM) initiative's Joint Experiment for Crop Assessment and Monitoring (JECAM) for testing, calibration and validation as well as recently funded projects such as Enhancing Food Security in African AgriCultural Systems with the support of Remote Sensing (AfriCultuReS, Grant Agreement No. 774652). Free State province constitutes the main agricultural production zone of South Africa, characterized by both commercial large-scale and small-holder farming of crops and livestock. At the study area, the summers are characterized by warm temperatures, i.e., an average of ~19.2°C and wet conditions, with an average precipitation of 115 mm. Winters are mildly cold with average temperatures of 6.9°C and a fair amount of rainfall, i.e. an average of 8mm, which results in soil moisture sufficient for winter wheat cropping. The summer crop calendar starts from December and ends in May or June, while the winter crop calendar begins in May and ends in October or November. The dominant summer crops are maize, soya, and dry beans, while winter crops include wheat and barley. Cultivation occurs on generally flat, undulating plains, characterized by well-drained sandy to sandy-loamy soils. Sentinel-2 Level-1C image, acquired on the 18 th of March 2019, was retrieved from Copernicus Open Access Hub (https://scihub.copernicus.eu/dhus/). Sentinel-2 Multi-Spectral Instrument (MSI) provides images in 13 spectral bands of three various spatial resolutions, i.e., 10m (B2-Blue: 490nm, B3-Green: 560nm, B4-Red: 665nm, B8-NIR1: 842nm, and B8A-NIR2: 865nm), 20m (B5-Red-edge: 705nm, B6-Red-edge: 740nm, B7-Red-edge: 783nm, B11-SWIR1: 1610nm, and B12-SWIR2: 2190nm) and 60m (B1-Deep blue: 443nm, B9-NIR3: 945nm, and B10-Cirrus: 1375nm). All spectral bands were used for this study, with 60m bands dedicated to atmospheric corrections and cloud screening [46]. The image acquired on the 18 th of March 2019 was chosen as it was the only available image closest (i.e., ±3 days) to the dates of in-situ spectral observations. Apart from this, cloud contaminated images are common during the rainy season. Therefore, the performance of AC approaches in such conditions should inform the utility of derived SR for agricultural monitoring.

Ancillary data Shuttle Radar Topography Mission (SRTM) Digital Elevation
Model at approximately 90m spatial resolution was obtained from the Consortium for Spatial Information (CGIAR-CSI, http://srtm.csi.cgiar.org/srtmdata/). The SRTM DEM is considered to be one of the most accurate, freely available global sources of digital elevation data. For this reason, it is also used as a pre-loaded digital elevation dataset in relevant applications by SNAP software, which is developed by the European Space Agency for processing images derived by the Sentinel-1 and Sentinel-2 satellites. SRTM 90m DEM (Version 4.1) was used for all AC approaches for consistency. The DEM is warped to the extent and resolution of the image being processed using bilinear resampling.

In-situ data
The spectral reflectance samples were collected between 11 th to 15 th of March 2019 from plots (i.e. parts of fields with a pre-defined size of 40m × 40m) using the Spectral Evolution PSR-3500 Spectrometer (Spectral Evolution, Inc. © 2014) with a spectral range of 350 -2500 nm. The spectrometer has ~3.5nm spectral resolution at 350 -1000 nm, 10nm at 1500 nm and 7 nm at 2100 nm. The spectral bands from 350 -1000 nm, at 1500 nm and 2100 nm have nominal spectral sampling intervals of 1.5 nm, 3.8 nm, and 2.5 nm, respectively. For each 40m × 40 m plot, GPS coordinates were collected at the centre using handheld GPS receiver, i.e. Spectra Precision MobileMapper® 120 with <50 cm accuracy. Canopy reflectance measurements were then collected by consistently holding a fibre optic cable (i.e. 25° field of view) at nadir angle and an observation distance of 0.5 m above the canopy. This resulted in an instantaneous field of view (IFOV) with a radius of approximately 11.08 cm. Therefore, each plot was represented by an average of nine homogeneous spectral measurements taken at distinct random locations, separated by an approximate distance of 10 m. This ensured that the spectral measurements were a representative mixture of reflectance as determined by the proportion, physical arrangement and reflective properties of plants [47]. All reflectance measurements were taken between 10:00 and 14:00 Central African Time (CAT) and normalized by taking a white reference panel reflectance before every measurement [48,49]. The number of plots and spectral observations per crop type (Table 1) was limited by suitable atmospheric conditions, i.e. cloudless or cloud windows, and the short time available for spectral measurements, i.e. 4 hours. The in-situ reference SR measurements were resampled to Sentinel-2 band specifications based on its spectral response functions (S2-SRF version 3.0., released December 2017) as implemented in hsdar R package (version 1.0.0., released May 2019) [50].  [34,51]. Sen2cor (version 2.8, released February 2019) is an atmospheric correction processor for Sentinel-2, available as a stand-alone command-line tool and a plugin to Sentinel Application Platform (SNAP) [33]. iCor (version 2.0, released February 2019), is also available as SNAP plugin for performing image-based atmospheric correction of Sentinel-2 MSI, Sentinel-3 and Landsat-8 Operational Land Imager (OLI) [35,42]. Framework for operational radiometric correction for environmental monitoring (FORCE) version 2.1, released 2018), is a scientific software for mass-processing and analysis of medium-resolution images from Landsat and Sentinel-2 missions [4,8,9,31,[52][53][54][55].

Bidirectional Reflectance Distribution Function (BRDF) correction
To account for variations in reflectance caused by different solar and viewing geometries between the in-situ reference SR and Sentinel-2 SR, we corrected the bidirectional reflectance distribution function (BRDF) effects using the c-factor approach [56,57]. The c-factor approach is based on a semi-empirical RossThick/LiSparse-Reciprocal BRDF model [58]. The model uses a set of fixed BRDF spectral model parameters derived by averaging over 15 billion pixels of a global year of all the highest quality, snow-free MODIS 500m BRDF product (MCD43). SR products derived using various AC approaches were normalized to nadir-view, i.e. nadir BRDF-adjusted reflectance (NBAR) following Eq. 1 below: ) , ( Where: is the Sentinel-2 SR with view zenith and solar zenith for wavelength , and is the nadir BRDF-adjusted SR equivalent estimated for solar zenith angle for a nadir view zenith . The values were estimated from spectrally equivalent MODIS bands [56] using RossThick/LiSparse-Reciprocal BRDF model [58], and from interpolated MODIS spectral BRDF model parameters for red-edge bands [57].

Surface reflectance validation
Atmospheric correction (AC) approaches were validated against in-situ reference SR using statistical metrics (i.e., coefficient of determination, R 2 , and p-value) based on a fitted line of Ordinary Least Squares regression (OLS) between in-situ reference SR and derived SR from each approach. Additional statistical error metrics, i.e., Root-Mean-Square Error (RMSE), Mean Absolute Error (MAE), Bias, and Mean Absolute Percent Error (MAPE) were used for error assessment.

RESULTS AND DISCUSSIONS
Atmospheric correction (AC) is a fundamental image processing step required for quantitative analysis of remotely sensed data. Additionally, the new data architectures such as data cubes powered by increasingly free and open data, hinge on AC for deriving ARD. Therefore, this study sought to validate Sentinel-2 MSI SR products derived from various AC approaches, i.e., MAJA, Sen2Cor, iCor, and FORCE, using in-situ reference SR measurements resampled to Sentinel-2 band specifications. A partially cloudy image with >60% cloud cover, acquired on the 18th of March 2019 (±3 days of the field spectral measurements), was used as a test. The results are presented in Figures 2 -7. Overall, Sen2Cor shows better visual quality, while all other AC approaches seem to be affected by haze when subjected to the same image stretching parameters, as can be seen in Figure 2. As part of the atmospheric correction, each AC approach performs cloud masking using different techniques outlined in Table 2. An illustration of clouds and cloud shadow masks generated by each AC approach is indicated in Figure 3. As can be seen in Figure 3, the AC approaches were marginally different in their capability to detect and characterize clouds and their associated shadows. However, non-negligible errors can be seen from the Sen2Cor cloud mask (derived from Cloud Probability output layer), where opaque cloud pixels were missed. Also, cloud shadow detection errors are notable from iCor and FORCE masks (derived from the Haze Optimized Transformation output layer). At the same time, MAJA seems to provide better generalization for the majority of such pixels. Generally, SR products from various AC approaches show consistently low and insignificant (p > 0.05) overall (i.e., combined) OLS relationship, i.e., R 2 with in-situ reference SR in the VIS (i.e., B2 -B4) and SWIR regions (i.e., B11 and B12). However, an analysis per crop type reveals higher OLS regression relationships that were masked in the overall (i.e., combined) analysis. For example, despite lower overall fit in the VIS region, i.e., R 2 <0.2, SR derived from MAJA ( Figure 4) over soya beans and dry beans showed considerably better fit with in-situ reference SR, i.e., R 2 > 0.7 in B2 and B4 respectively. Generally, the results show consistently better OLS fit, i.e., R 2 > 0.7 in B4 for SR over dry beans and consistently moderate OLS fit, i.e., R 2 ranging from ~0.3 to ~0.5 in B2, B3, B5 and B12 across all AC approaches, except for B5 SR derived from iCor ( Figure 6) that showed no relationship, i.e. R 2 < 0.2. On the other hand, SR over soya beans has consistently moderate fit, i.e., R 2 ~0.3 to ~0.5 in B2 and B12 across all AC approaches, except for iCor that showed lower fit for B12. However, iCor SR over soya beans show considerably better fit, i.e., R 2 of about 0.67 in B11 and a significant R 2 of 0.99 (p=0.03) in B5, FORCE SR ( Figure 7) over soya beans show moderate fit, i.e., R 2 ~0.3 in B11. Similarly, SR derived from MAJA over maize show consistently moderate fit, i.e., R 2 ~0.3 in B3 and B5.
On the other hand, the results show that overall (i.e. combined) OLS relationship and significance improves in the Rededge (i.e. B6 -B7) and NIR (i.e. B8 and B8A) spectral regions, except for B5 (RE) that exhibit lower fit with in-situ reference SR. Specifically, MAJA SR in these regions shows better fit, i.e. R 2 from ~0.63 to ~0.64, followed by iCor and FORCE that have R 2 ranging from 0.6 to 0.62, while Sen2Cor show relatively lower but significant fit, i.e. R 2 of ~0.6 to 0.61. The results of OLS relationship by crop type show consistently better relationship over dry beans, i.e. R 2 >= 0.7 across all AC approaches, except for B6 SR derived from iCor that shows moderate R 2 of ~0.35. On the other hand, SR over soya beans shows consistently high negative OLS relationships with in-situ reference SR, i.e. R 2 ranging from 0.75 to 0.99 in B6 across all AC approaches. SR over maize show consistently moderate R 2 of between 0.5 and 0.57, i.e. highest in B6 and lowest in B8A respectively, except for B6 SR derived from iCor that shows no relationship, i.e. R 2 < 0.2.    On the contrary, the overall (i.e., combined) error metrics such as RMSE, MAE and BIAS were consistently lower in the VIS and SWIR spectral regions, and higher in the Red-edge (RE) and NIR spectral regions across all AC approaches and crop types. For example, results show lower errors in the VIS and SWIR spectral regions, i.e., RMSE, MAE, and BIAS < 0.1, while RE and NIR spectral regions exhibited relatively higher errors, i.e., RMSE, MAE and BIAS > 0.1, except for B5 that had errors of < 0.1. Band-wise error metrics by crop type show no divergence from the overall error metrics. Overall, the results show differences in the performance of various AC approaches, as well also showing these differences by crop type.
Validation of SR products showed consistent performance across different AC approaches, i.e., lower OLS regression relationships in the VIS and SWIR, and better relationships (R 2 values >0.6) with the in-situ reference SR in the Rededge and NIR regions, with minor variations between AC approaches. However, SR derived from MAJA achieved better overall OLS regression fit with R 2 values > 0.62 in the RE and NIR regions, followed by iCor and FORCE with R 2 values of ~0.61 and 0.62. While, Sen2Cor had a relatively lower fit in these regions, i.e., R 2 values of ~0.6 to 0.61. MAJA's better performance can be attributed to its robust hybrid algorithm design that combines both multi-spectral and multi-temporal methods [59]. In this study, MAJA used images covering the entire growing season, i.e., 36, instead of a single image to estimate Aerosol optical thickness (AOT) and for atmospheric correction. As a result, it was robust in characterizing surface reflective properties and removing aerosol effects. However, the differences between MAJA's performance and other AC approaches were not significant. It should be noted that the use of ancillary aerosol data such as Copernicus Atmosphere Monitoring Service (CAMS), instead of the constant aerosol model has reduced bias between predicted SR and in-situ SR [60]. This approach can potentially improve the current MAJA results. Nevertheless, the performance of multi-spectral methods, i.e., Sen2Cor, iCor and FORCE that retrieves AOT based on dark, dense vegetation (DDV) pixels within the image performed similarly to the MAJA's hybrid approach, where the differences in R 2 and error metrics (RMSE, MAE, and BIAS) were negligible. This is in agreement with Hagolle, Huc, Villa Pascual and Dedieu [51]. They note that the multi-spectral methods perform better than multi-temporal methods in vegetated landscapes such as agricultural landscapes. The results in this study, indicating lower fit with in-situ reference SR in the Sentinel-2 VIS region (i.e. B2 -B4). Better fit in the RE (i.e., B6 and B7) and NIR region (i.e., B8 and B8A) are expected since NIR region has low aerosol scattering effect than the VIS region [61]. However, these results are inconsistent with previous studies [28] that found a better fit in the VIS bands for MAJA, iCor, and Sen2Cor approaches and relatively lower fit in the RE and NIR bands. These inconsistencies and significantly higher R 2 values, i.e., R 2 >0.9 across all bands and approaches found in previous studies [28,62] can be attributed to the availability of almost simultaneous in-situ reference SR data with Sentinel-2 cloud-free overpasses, the higher number of samples, i.e. ~200 spectra per plot, and more repetitions of in-situ spectral measurements. Although the time gap between field spectral measurements and satellite overpass, i.e., ±3 days as well as high cloud cover, i.e., >60% were a limitation in this study, the results address the knowledge gaps about the effectiveness of various AC approaches in cloudy atmospheric conditions. In addition, this study has implications for the use of SR during high cloud cover conditions (i.e. common during the rainy season) especially for agricultural monitoring applications.
Sentinel-2 MSI VIS B2 -B4 exhibited relatively low R 2 , i.e., <0.2 across all AC approaches, attributable to the high presence of clouds in the image. Clouds enhance apparent path radiance, cause an overestimation of AOT, and as a result, an underestimation of surface reflectance in the VIS region [63]. The underestimation of SR by various AC approaches can be seen in the scatterplot, i.e., figures 4 -7. Although all AC approaches ensured correction of adjacency effects (Table 3), residual adjacency effects may persist, especially for pixels near clouds [64]. Therefore, lower R 2 and relatively higher errors (i.e., RMSE, MAE, and BIAS) found across all bands, AC approaches, and crop types may be a result of such residual effects. However, the effectiveness of adjacency correction in cloudy scenes as implemented in various AC approaches need to be evaluated in future studies. In addition to presenting challenges for atmospheric correction approaches, clouds also affect the usability of contaminated images [6]. Fortunately, each AC approach evaluated in this study performs cloud masking as part of the atmospheric correction procedure (Figure 3). This is an essential feature since Sentinel-2 Level 1-C cloud mask product is unreliable [56]. Thus, cloud masks generated by the evaluated AC approaches in this study offer prospects for generating cloud-free temporal composites and spectrally consistent time series data for agricultural applications. Besides, the cloud masks constitute a vital image quality information for ARD and data cubes. However, more and in-depth analysis of generated cloud masks by different AC approaches is recommended for future studies.
Band-wise error metrics, i.e., RMSE, MAE, and BIAS were consistently low, i.e., < 0.1 in the VIS and SWIR spectral regions, and higher, i.e., >0.1 in the Red-edge (RE) and NIR spectral regions across all AC approaches and crop types. Previous studies [28] found that SR derived from MAJA had relatively lower errors in bands 2 -5 and bands 11 and 12, however, all approaches were similar in performance in this study. Therefore, it can be expected that relatively lower error propagation in the SWIR bands should result in reciprocal lower errors in SVIs such as normalized difference water index (NDWI [65]), normalized burn ratio (NBR) and normalized difference nitrogen index (NDNI, [66]) designed around SWIR bands, thus improved monitoring and assessment of canopy water content [67], burned area and severity assessments [68,69], and canopy nitrogen content respectively. In general, the usefulness of SWIR bands for vegetation monitoring is increasingly being demonstrated in the literature [70][71][72]. Herrmann, Karnieli, Bonfil, Cohen and Alchanatis [73], for instance, argue that VIS/NIR and SWIR regions have a similar relationship in characterizing vegetation conditions. Although various factors control the reflectance in these regions, mainly chlorophyll absorption in B2 (blue) and B4 (red) and water absorption in B11 (SWIR1) and B12 (SWIR2) [74], these regions (i.e., VIS/NIR and SWIR) both exhibit very low vegetation reflectance. On the other hand, the errors in the RE bands 6 and 7, and NIR bands 8 and 8A were the highest, i.e., RMSE and MAE >0.2 with minor variations across different AC approaches. While this is concerning for the accuracy and reliability of RE-based SVIs such as the normalized difference red-edge index (NDRE), Clevers, Kooistra and Van Den Brande [75] show that SVIs based on VIS/NIR bands at 10m spatial resolution are superior to those based on RE bands in estimating LAI, leaf chlorophyll content (LCC), and canopy chlorophyll content (CCC), attributed to lower spatial resolution (i.e., 20m). Nevertheless, other studies demonstrate the importance of RE bands and related SVI for estimating LAI and canopy storage capacity [76], grass nutrient estimation [77] and grass discrimination [78]. The higher errors in NIR bands, on the other hand, have profound implications since most commonly used SVIs for agricultural monitoring including biomass and nutrient content estimation [79,80], crop type mapping [67,81], and yield estimation [80,82], are designed around NIR reflectance, e.g., NDVI, soil-adjusted vegetation index (SAVI), ARVI and EVI. Therefore, the results in this study suggest that such SVIs need to be used with caution, especially in cloudy conditions, i.e. >60%. After all, further validation is needed since the errors may have been due to the sensitivity of the RE and NIR regions towards the rapid changes agricultural landscapes that may have occurred in ±3 days difference between the date of acquisition of the Sentinel-2 image and in-situ reference SR, despite stable conditions in this study.

CONCLUSIONS
This study assessed the SR products from various atmospheric correction (AC) approaches for Sentinel-2 MSI in a semiarid agricultural landscape. Validation using in-situ surface reflectance measurement was carried out on a Sentinel-2 image with high cloud cover, i.e., >60%. Validation of Sentinel-2 MSI SR image with >60% cloud cover showed relatively high consistencies with in-situ surface reflectance when using MAJA, especially in the RE and NIR regions, attributable to its robust design that combines multi-spectral and multi-temporal temporal methods. However, the differences in R 2 between AC approaches in the RE and NIR regions were not significant, attributable to lower scattering effects in these regions than in the VIS region of the electromagnetic spectrum. On the other hand, the error metrics (RMSE, MAE and BIAS) were consistently lower in the VIS, band 5 (RE), and SWIR regions for MAJA, iCor, and FORCE, while Sen2Cor had better performance in the VIS region and band 5 (RE) only. Therefore, MAJA, iCor, and FORCE can be used to derive ARD for most agricultural applications, including those that use VIS and SWIR bands such as assessment of canopy water and nitrogen content, especially in cloudy conditions. On the other hand, the relatively high errors (i.e., RMSE >0.2) in the RE (bands 6 and 7) and NIR (band 8 and 8A) across all evaluated AC approaches have profound implications for the accuracy and reliability of SVIs that use these regions such as NDRE, SAVI, ARVI, and EVI, under partly cloudy conditions. Generally, crop-specific R 2 was better than the overall (i.e., combined) R 2 , indicating that overall metrics may mask essential relationships. Further validation in different cloud cover conditions, as well as the effectiveness of adjacency correction and cloud masking procedures, is needed in future studies.
providing data, i.e., freely available from the ESA data hub (https://scihub.copernicus.eu/dhus) and AfriCultuReS project Consortium for providing field data used in this study. We acknowledge the assistance provided by Mr. Johnny Rizos during fieldwork, and farmers who provided access to their fields. We appreciate the development teams for R packages; ggplot2, Metrics, etc. Last but not least, we highly appreciate the implementation of the BRDF correction method for Sentinel-2 in Google Earth Engine (GEE) accessible from (https://mygeoblog.com/2018/10/21/brdf-correction/). This research was conducted as part of AfriCultuReS project. "This project received funding from the European Union's Horizon 2020 Research and Innovation Framework Programme under grant agreement No. 774652".