Multitemporal and multiresolution leaf area index retrieval for operational local rice crop monitoring

This paper presents an operational chain for high-resolution leaf area index (LAI) retrieval from multiresolution satellite data specifically developed for Mediterranean rice areas. The proposed methodology is based on the inversion of the PROSAIL radiative transfer model through the state-of-the-art nonlinear Gaussian process regression (GPR) method. Landsat and SPOT5 data were used for multitemporal LAI retrievals at high-resolution. LAI estimates were validated using time series of in situ LAI measurements collected during the rice season in Spain and Italy. Ground LAI data were collected with smartphones using PocketLAI, a specific phone application for LAI estimation. Temporal evolution of the LAI estimates using Landsat and SPOT5 data followed consistently the temporal evolution of the in situ LAI measurements acquired on several Mediterranean rice varieties. The estimates had a root-mean-square-error (RMSE) of 0.39 and 0.51 m2/m2 in Spain and 0.38 and 0.47 m2/m2 in Italy for Landsat and SPOT5 respectively, with a strong correlation (R2 > 0.92) for both cases. Spatial-temporal assessment of the estimated LAI from Landsat and SPOT5 data confirmed the robustness and consistency of the retrieval chain. This paper demonstrates the importance of an adequate characterization of the underlying rice background in order to address changes in background condition related to water management. Results highlight the potential of the proposed chain for deriving multitemporal near real-time decametric LAI maps fundamental for operational rice crop monitoring, and demonstrate the readiness of the proposed method for the processing of data such as the recently launched Sentinel-2. © 2016 Elsevier Inc. All rights reserved.


Introduction
Green leaf area index (LAI) is a key biophysical parameter which represents half the total intercepting leaf area per unit ground surface area (Chen and Black, 1992). LAI plays an important role in vegetation processes such as photosynthesis and transpiration, and is connected to meteorological/climate and ecological land processes. LAI has been widely used in many agricultural and remote sensing studies (Carlson and Ripley, 1997;Clevers, 1988;Gower et al., 1999;Myneni et al., 1997). Concerning biomass and crop yield estimation, LAI estimates can be assimilated in crop models (Confalonieri et al., 2009) by means of forcing and/or recalibration techniques (Dorigo et al., 2007;Quaife et al., 2008). LAI retrieval from satellite data is among the main goals of the remote sensing community (Chen et al., 2002;Colombo et al., 2003;Fang and Liang, 2005) as evidenced by the variety and usefulness of operational medium resolution products for vegetation monitoring from satellite sensors, such as the Moderate Resolution Imaging Spectroradiometer (MODIS) (Myneni et al., 2002) and the Système Pour l'Observation de la Terre (SPOT) VEGETATION (Baret et al., 2007. Nevertheless, higher spatial resolutions (10-30 m) are needed to support crop management activities at a parcel level. In this context, the Landsat Data Continuity Mission (LDCM) (Roy et al., 2014) and the recently European Sentinel-2 Mission (Drusch et al., 2012;Malenovský et al., 2012) provide valuable high-resolution (HR) information for a wide variety of land applications (Malenovský et al., 2012) including crop monitoring.
From a methodological point of view, LAI retrieval can be faced following either statistical, physical, or hybrid methods (Camps-Valls et al., 2011;Verrelst et al., 2015a;Campos-Taberner et al., 2015b). Parametric  derived from airborne and satellite spectra (Haboudane et al., 2004). Alternatively, non-parametric methods do not assume an explicit (parametric) relation between LAI and the spectral reflectance. Non-parametric models estimate a variable of interest using a training database of biophysical parameter and spectral data pairs. In the last decade, non-parametric techniques excelled in biophysical parameter retrieval, either following pure statistical or hybrid approaches. Many methods have been used in a wide range of applications (Atzberger and Richter, 2012;Verrelst et al., 2015a). Specifically, current operational vegetation products such as LAI are typically produced with neural networks (NN) (Baret et al., 2007. Nevertheless, in the recent years, Gaussian process regression (GPR) (Rasmussen and Williams, 2006) provided encouraging results in the framework of biophysical parameter estimation outperforming the rest (Camps-Valls et al., 2016;Lazaro-Gredilla et al., 2014;Verrelst et al., 2012aVerrelst et al., , 2015b. In statistical approaches, concomitant in situ measurements of the biophysical parameter of interest and the associated spectral data from remote sensing platforms are used as a training database, whereas the hybrid approaches rely on a database generated by a radiative transfer model (RTM). The advantage of hybrid approaches is that a broad range of land cover situations can be simulated (e.g. up to hundred thousands), leading to a data set much bigger than what can be collected during a field campaign. RTMs are based on the physical knowledge describing the interactions between radiation, canopy elements, and the soil surface. RTMs have been used for modeling different types of vegetation, making them suitable for general-purpose retrieval applications (Jacquemoud et al., 2000). In particular, the PROSAIL RTM (Jacquemoud et al., 2009), which results from the PROSPECT leaf optical model (Jacquemoud and Baret, 1990) coupled with the SAIL canopy reflectance model (Verhoef, 1984), has been used in several remote sensing studies. PROSAIL has been successfully applied to a variety of crops (Duan et al., 2014). However, it is worth noting that RTM inversion poses several methodological problems: it may lead to high computational cost, and being an illposed problem, it may give rise to unstable results. Prior information related to the distribution of the canopy variables and representative background spectra can be implemented in the RTM to better address the inversion process (Combal et al., 2003;Meroni et al., 2004).
Direct validation comparing LAI predictions with in situ LAI measurements is needed to report the accuracy of LAI retrievals. LAI ground measurement methods can be divided into two categories: direct and indirect methods (Breda, 2003;Jonckheere et al., 2004). Direct methods involve destructive techniques and require a field effort in collecting leaf samples. Due to the difficulty of continuous applications of direct methods on a large scale, the use of indirect methods based on measurements of the transmission of radiation through the canopy have been widely used . In this context, to exploit new technologies in crop monitoring, smartphones have being used for indirect rice LAI measurements reporting good consistency and performance compared with classical instruments such as LAI-2000 or LAI-2200 Plant Canopy Analyzers (LI-COR, Inc., Nebraska, USA) and digital hemispherical photography (DHP) (Campos-Taberner et al., 2016).
In this paper, we present an operational chain for LAI retrieval from Landsat and SPOT5 satellite data specifically calibrated for rice crops. One specific characteristic of rice cultivation is the land preparation followed by the pre-season flooding, this agronomic practice characterizes worldwide the majority of rice cropping systems (namely, irrigated rice, lowland rain-fed rice and deep-water rice) which account for over 90% of the 154 million ha (Maclean, 2002) cultivated with rice each year. In the European temperate rice cultivation areas, the background of the fields is dry at the beginning of the season, and it remains in this condition until early-May when the fields are starting to be flooded. From this date on, the soil background is flooded most of the time except in some dates when the farmers pump out the water for agronomic purposes. From a remote sensing point of view, this characteristic determines a strong change in the soil background conditions from dry soil to standing water (Boschetti et al., 2014). Thus, the intermittent flooding in a paddy rice field generates an uncontrolled reflectance signature, which may confound the retrieval of rice LAI. It is therefore relevant to test methods for this specific crop due also to its worldwide importance.
In this study, LAI is estimated using hybrid methods through the generation of a reflectance and associated LAI database from the PROSAIL model and powerful nonlinear inversion methods, such as neural networks, kernel ridge regression (KRR) and Gaussian process regression. Based on experimental performance in our datasets, we focus on GPR. This work uses for the first time GPR as the regression tool for multitemporal LAI production chain during the rice cycle. This study shows results produced in the framework of the ERMES project (http://www.ermes-fp7space.eu/) where rice monitoring is performed exploiting seasonal remote sensing data and crop modeling as a demonstration of potential operation system. Actually, Landsat data were processed in near real time and the corresponding LAI maps were provided through the web-based geoportal in the project framework to the crop modelers. Contrarily, SPOT5 Take5 data were analyzed in backcasting to assess the potential use of Sentinel-2 data. Spatial and temporal consistency of the LAI estimates were validated with the available multitemporal in situ measurements collected in two ERMES study areas (Italy and Spain) from sowing up to rice LAI peak.
The remainder of the paper is organized as follows. Section 2 describes the study areas for which LAI maps are derived, the in situ LAI measurements and the remote sensing surface reflectance data. The parameterization of the PROSAIL model, as well as the theoretical basis of the regression methods, are presented in Sections 3.1 and 3.2 respectively. Section 4 evaluates the spatial-temporal consistency of the obtained LAI estimates and provides an analysis of LAI trends for different varieties and management practices as well as discusses the impact of rice background in multitemporal LAI retrievals. Finally, Section 5 outlines the main conclusions and discusses the utility of the obtained results in the framework of operational rice monitoring systems.

Study areas
ERMES aims to develop a prototype of COPERNICUS down-stream services based on the assimilation of Earth observation (EO) and in situ data within crop modeling solutions dedicated to the rice sector. In this framework, the ERMES study areas have been selected in Spain (Valencian area), Italy (Piedmont and Lombardy rice district) and Greece (Thessaloniki area), which are the three countries responsible of 85% of total European rice production. In this study, we focus on the Italian and Spanish ERMES local study areas. Within each study area, rice is a common crop with a long tradition and economical value. Rice varieties in the Spanish area are mainly Bomba and Senia belonging to the Japonica group. Sowing activities are concentrated around May 10-15th and fields are managed by keeping them flooded for most of the time during the rice growing period. In Italy, the rice cropping systems are much more variable. About 180 varieties are cultivated covering both Japonica and Indica groups characterized by short and long cycles with a duration from 120 to >150 days, respectively (Boschetti et al., 2009). For this reason, the sowing date can vary from the beginning of April to mid of May. Moreover, in Italy, the so-called dry seeding technique is increasing year after year. This technique consists of seeding rice in rows with a common seeder in dry soil condition and held without water until the unfolding of 2-4 leaves, and then crops are flooded. In 2014, it was estimated that more than 30% of the rice cultivated area adopted this technique.
The Spanish study area is located in selected farms within the rice district of Valencia, East of Spain (see Fig. 1 (top)) belonging to the Albufera Natural Park, which is included as a special protection area in the Natura 2000 network by the European Commission allowing only rice crop practices (European Commission, 2011). The area has a typical Mediterranean climate, with an average annual temperature and humidity of 17 • C and 65%, respectively. The mean annual precipitation is approximately 430 mm, mainly recorded in autumn. The Spanish site is a homogeneous rice planting area of approximately 10 km × 20 km extension. The Italian study area is the Lomellina rice district, which is located in the south-western Lombardy region, between the Ticino, Sesia and Po rivers where rice is the dominant crop (>90%). It is the rural part of the Pavia province, which includes 58 municipalities, and is particularly renowned for its rice cultivation (see Fig. 1 (bottom)).

Ground measurements
In the framework of the 2015 ERMES activities, LAI ground measurements were conducted over the previous study areas. Based on land cover distribution in the areas and information provided by farmers at the very beginning of the rice season, a reliable sampling was achieved selecting ESUs (elementary sampling units) with different rice varieties and sowing dates in order to cover as much as possible the variability of the study areas. In Spain, the total number of ESUs was 40. For some of the ESUs, field acquisitions were made in all field campaigns, whereas in other ESUs the temporal frequency was approximately one every two field campaigns. In the Italian study area, 10 ESUs were fixed and considered during all the field campaigns, increasing the number of ESUs up to 19 for some dates (see Table 1). The same sampling scheme was used over each ESU, following the guidelines and recommendations of the Validation of Land European Remote sensing Instruments (VALERI) protocol (http://w3.avignon.inra.fr/valeri/). For the case of row crops, this protocol suggests to take measurements along small transects between rows incorporating some random acquisitions to prevent possible biases in the characterization of the row effect. We adopted the same schema also for fields with broadcast seeding. The size of the ESUs was approximately 20 m × 20 m, and the locations were far from the field borders. In order to characterize the spatial variability within each ESU, a range of 18 to 24 measurements was taken. This number of replicates allows to obtain a statistically significant mean LAI estimate per ESU. The center of the ESU was geo-located using a GPS for later matching and associate the mean LAI estimate with the corresponding satellite spectra.
LAI estimates were acquired in the two countries with smartphones using an app called PocketLAI (Confalonieri et al., 2013). PocketLAI computes indirect LAI measurements through the segmentation of images acquired at 57.5 • below the canopy and showed good performance in canopies with different structures (Francone et al., 2014). PocketLAI estimates can reproduce destructive LAI measurements with acceptable results in terms of both reliability and accuracy (Confalonieri et al., 2013). A schematic of a single LAI measurement using PocketLAI and the theoretical background are shown in the supplementary material (see Appendix A). Since the gap fraction-LAI relationship is not linear, it is not equivalent to first average the gap fraction and then estimate the LAI than the contrary . However, specifically over rice crops, we have recently shown that LAI measurements taken with PocketLAI align well with other traditional acquisition techniques, such as plant canopy analyzers and digital cameras for hemispherical photography (Campos-Taberner et al., 2016). Together with PocketLAI data, ancillary information such as rice phenology and flood condition was taken on each ESU to better evaluate the computed LAI maps.

Landsat imagery
The United States Geological Survey (USGS) facilitates free access to Landsat archive data (Woodcock et al., 2008). In this work, Landsat 8 OLI data (30 m pixel resolution) were downloaded through the USGS Earth Resources Observation and Science (EROS) Center Science Processing Architecture (ESPA) (http://espa.cr.usgs.gov/). The provisional Landsat 8 Surface Reflectance product was used (LaSRC) . In order to focus on the rice areas only, the Landsat 8 OLI images were cropped to 1500 × 800 and 480 × 431 pixel size in the case of Spain and Italy, respectively. Images were available every 16 days in Italy. The Spanish study area lies in two Landsat paths within the same row (198/33 and 199/33), increasing the temporal resolution of the images to seven and nine days. Landsat 8 OLI surface reflectance spectral bands were filtered to relate only the blue (B), green (G), red (R), near infrared (NIR), and the two short wave infrared (SWIR1, SWIR2) channels with the ground LAI measurements in the retrieval process.
Cloud contamination is a common problem which limits the utility of passive optical multispectral images. To deal with this problem, Landsat 7 ETM+ data were used for increasing the temporal resolution of cloud free images. Nevertheless, Landsat 7 data can be affected by data gaps causing lost information of about 22% of the pixels in ETM+ images (referred to as SLC-off images) (Arvidson et al., 2006;Ju and Roy, 2008). In this study, we used the Neighborhood Similar Pixel Interpolator (NSPI) algorithm (Chen et al., 2011), which makes use of the neighboring pixels with similar spectral characteristics to predict the value of missing pixels using spatial and temporal information of gap free images (Campos-Taberner et al., 2015a). To compute the spectral similarity between the target pixel and neighboring pixels, only cloud-free observations are used, thus exploiting information about the cloud mask and the temporal trajectory. Fig. 2 shows the available cloud-free Landsat imagery acquired during the 2015 rice season over the study areas.
It is worth noting that Landsat 7 ETM+ and Landsat 8 OLI have different spectral response functions. This leads to slight differences in surface reflectance for each sensor. Nevertheless, differences are low when surface reflectance is simulated (see Fig. S2 of the supplementary material in Appendix A). On the other hand, another source of discrepancies in LAI retrievals is the atmospheric correction used for the generation of Landsat 7 ETM+ and Landsat 8 OLI surface reflectance data. Although the Landsat 7 ETM+ atmospheric correction (LEDAPS) is less accurate than the Landsat 8 OLI atmospheric correction  differences in surface reflectance and LAI estimates of two consecutive images are small (Roy et al., 2016).

SPOT5 imagery
With the aim of simulating multitemporal data from Sentinel-2 mission, the SPOT5 Take5 experiment provided 10 m pixel resolution data over selected sites every 5 days under constant angles from end-April to early-September 2015, thus covering the majority of the vegetation phase for summer crops in Europe. In this framework, ESA launched a call for proposals of sites selection, and the ERMES study areas were included in the 2015 SPOT5 Take5 acquisition plan. A total of 17 and 18 SPOT5 cloud-free images were acquired over the Spanish and the Italian study areas (see Fig. 2). The imagery were downloaded through the Theia land data center, which provides a top of canopy surface reflectance product (green, red, near infrared and short wave infrared channels) obtained applying an ortho-rectified process (Baillarin et al., 2008) and then computing an atmospheric correction with MACCS Software . Images were spatially cropped to 1440 × 1293 (Italy) and 4500 × 2400 (Spain) pixel size in order to match with the extent of the corresponding Landsat imagery.
Note that pixels corresponding to urban areas, water bodies and areas of 'no interest' such as non-rice crops were masked out during the retrieval process in order to avoid meaningless LAI estimates over those surfaces. The masking process was realized using official parcel boundaries and farmers crop declaration for CAP (European Common Agricultural Policy) obtained from the Valencian government (http:// terrasit.gva.es/) and the Lombardy regional authorities (https:// www.dati.lombardia.it/browse?category=Agricoltura).

Retrieval methodology
A general outline of the proposed retrieval chain is shown in Fig. 3. The basic ingredients include the PROSAIL model and statistical regression algorithms for model inversion. Once the regression models are developed, we apply them to retrieve high-resolution LAI maps on the study areas from corresponding Landsat and SPOT5 surface reflectance data. In the following subsections we describe these components in detail.

PROSAIL model
The PROSAIL radiative transfer model (Jacquemoud et al., 2009) was used to build the database for training the retrieval model. It assumes the canopy as a turbid medium for which leaves are randomly distributed. PROSAIL simulates the top of canopy bidirectional reflectance in the range of 400 nm to 2500 nm as a function of input variables related to the structure of the canopy, the leaf optical properties, the background soil reflectance and the sun-view geometry. Leaf optical properties are expressed taking into account the mesophyll structural parameter (N), leaf chlorophyll (C ab ), dry matter (C m ), and water (C w ) contents. C w was tied to the dry matter content (C w = C m × C wREL /(1 − C wREL )) assuming that green leaves have a relative water content (C wREL ) varying within a relatively small range (Baret et al., 2007) (see Table 2). On the other hand, canopy structure is characterized by the average leaf angle Fig. 3. Operational chain followed in the multitemporal LAI retrieval.
25 Gaussian inclination (ALA), the LAI and the hot-spot parameter (Hotspot). A multiplicative brightness parameter (b s ) was introduced and applied to spectral flooded and dry soil signatures to represent different background reflectance types (Baret et al., 2007;Claverie et al., 2013). The system geometry was described by the solar zenith angle (h s ), view zenith angle (h v ), and the relative azimuth angle between both angles (DH). Sub-pixel non-vegetated areas were found in the borders of rice fields but patches of bare/flooded soil, small water stripes and channels were found in paddies as well. These conditions can be due to water drainage, very poor soil fertility, flattening mechanical process and other causes related to agro-practices leading to high yield reduction. The interested reader is referred to Fig. S4 of the supplementary material attached in Appendix A for further details. Therefore, in order to account for these mixed conditions, a pixel can be represented by a linear mixture of vegetation (vCover) and bare/flooded soil (1-vCover) spectra. A linear spectral mixing model was assumed for the sake of simplicity. Taking this heterogeneity into account, the pixel reflectance can be expressed as R = R veg × vCover + R soil × (1 − vCover), where R veg and R soil account for pure vegetation reflectance and background, respectively. This simple approach was introduced by Baret et al. (2007) to account for clumping at the landscape level. Note that when validating retrieved LAI of a mixed pixel, it is compared with LAI × vCover (see distributions of LAI and vCover in Table 2). In this study, vCover was assumed to be independent of LAI, following a truncated Gaussian distribution. In addition, a 5% of pure background spectra (vCover=0) were added to represent situations at the beginning of the season (no vegetation) and large patches of bare/flooded soil present during the rice season.
The leaf and canopy variables as well as the soil brightness parameter were randomly generated following specific distributions (see Table 2). The parameterizations were similar to other studies using high-resolution sensors (Bsaibes et al., 2009;Duveiller et al., 2011). Nevertheless, in this study, a site-specific parameterization of the PROSAIL model based on the available 2014 ERMES field measurements was selected in order to constrain the behavior of the model to Mediterranean rice areas reducing the equifinality of the ill-posed PROSAIL inversion process (Combal et al., 2003). During the ERMES 2014 rice season, leaf chlorophyll content was measured with a SPAD-502 chlorophyll meter (Campos-Taberner et al., 2016) thus allowing to constrain the C ab range to Mediterranean rice values which typically vary from 35 to 70 lg • cm −2 depending on the rice variety. This range was slightly extended in order to represent rice with high leaf chlorophyll concentration due to fertilization practices, as well as low leaf chlorophyll content caused either by possible diseases, blasts or nitrogen deficits. ALA distribution was selected for accounting specific leaf inclination during rice phenological stages (Zhang et al., 2013). In order to better constrain the retrieval to rice crops, a spectral library of underlying soil background was generated by considering signatures of homogeneous flooded and dry areas identified within rice fields in the study areas with Landsat and SPOT5 imagery, which was spectrally matched with typical rice background spectra collected by Boschetti et al. (2014). Each background signature was selected randomly from the spectral library and multiplied by the soil brightness factor (b S ), which was assumed to follow a Gaussian distribution (see Table 2). b S is assumed to be sensitive to soil moisture, roughness and geometrical configuration (Baret et al., 2007). A random Latin hypercube sampling design allowed to populate more evenly the canopy realization space (Mckay et al., 2000). For each sensor, a single dataset of PROSAIL simulations was performed, which included all geometrical configurations. The distributions for the system geometry were randomly generated based on information in imagery metadata.

Regression methods
In this paper we propose inverting PROSAIL using machine learning statistical algorithms. For this purpose, we used three representative nonlinear regression methods: the familiar artificial neural network, and two related kernel-based regression algorithms: the kernel ridge regression and the Gaussian process regression. This section reviews the three regression methods used and discusses about the implementation issues for the interested reader.

Neural networks (NN)
Artificial neural networks are based on the combination of simple nonlinear processing units, called neurons, into a fully connected hierarchical architecture. The network can model complex, nonlinear input-output relations, and has been the preferred regression and function approximation tool for decades for retrieving biophysical parameters. Actually, the vast majority of hybrid inversion methods consider the use of neural networks (Baret et al., 1995;Baret and Fourty, 1997;Smith, 1993) for retrieval of canopy parameters. Essentially, each neuron in a network performs a linear regression followed by a non-linear activation (sigmoid-like) function. Neurons of different layers are interconnected by weights that are adjusted during the training phase (Haykin, 1999). In order to train the network (i.e. fit the weights), one has to select a cost function (in our case the least squares loss) and an algorithm to do this (in our case the standard back-propagation algorithm). Several hyperparameters are involved as well and impact the solution: essentially, the number of hidden layers and neurons/nodes and the learning rate of the back-propagation algorithm.

Kernel methods
Kernel methods (Shawe-Taylor and Cristianini, 2004) owe their name to the use of kernel functions, which measure similarities between input data examples. We used two related and powerful kernel methods for regression: kernel ridge regression (Shawe-Taylor and Cristianini, 2004), and Gaussian process regression (Rasmussen and Williams, 2006). The KRR is considered as the nonlinear (kernel) version of the canonical least squares linear regression, while the GPR is a probabilistic approximation to nonparametric kernel-based regression, where both a predictive mean (point-wise estimates of LAI) and predictive variance (error bars for the LAI predictions) can be derived. Notationally, both methods offer the same explicit form of the predictive model, which establishes a relation between the input (e.g., spectral data) x = [x 1 , . . . , x B ] ∈ R B and the output variable (i.e., LAI) y ∈ R of the form: are the spectra used in the training phase, a i is the weight assigned to each one of them, a o is the bias in the regression function, and k h is a kernel or covariance function (parametrized by a set of hyperparameters h) that evaluates the similarity between the test spectrum and all N training spectra.
In order to generate a kernel regression model, one needs to specify a covariance/kernel function k h , to infer its hyperparameters h and model weights a. For the KRR prediction model, we used the squared exponential (SE) kernel: k(x i , x j ) = exp(− x i − x j 2 /(2s 2 )), which is simply parameterized by h = s (also known as the kernel length-scale) that needs to be tuned typically by cross-validation.
For the GPR prediction model, we used the so-called automatic relevance determination (ARD) kernel, as an alternative generalization of the isotropic SE prior: where m is a scaling factor, B is the number of bands, and s b is a dedicated parameter controlling the spread of the relations for each particular spectral band b. Model hyperparameters are collectively grouped in h = [m, s n , s 1 , . . . , s B ], and model weights a i can be automatically optimized by maximizing the marginal likelihood in the training set (Rasmussen and Williams, 2006;Verrelst et al., 2012b). The obtained weights a i after optimization give the relevance of each spectrum x i , while the inverse of s b represents the relevance of each band b. Hence, low values of s b indicate a higher informative content of this certain band b to the training function k. In this paper we study both the KRR and the GPR models paying attention to the band relevance conveyed by the inferred s b values, and the prediction uncertainty provided by the GPR model.

Model development and source code
Inference of the hyper-parameters for all methods and the weights for doing predictions was done as follows. We first generated 2000 data pairs (reflectances-LAI values) with PROSAIL, and used 70% for model selection (training set), and then evaluated and showed the results in the remaining 30% test set, which was never used or seen in model fitting. Even though 1400 samples could seem to be insufficient for training purposes, increasing this number of samples did not have a significant impact in the accuracy of the retrievals for all methods in the test set, indicating that they did not incur in any overfitting issue and highlighting the good representativity of the simulated data. For neural networks, the hyperparameters to be chosen were the number of hidden layers and neurons (for simplicity we evaluated one hidden layer and 2-30 hidden neurons) and the learning rate between 0.001 and 0.1 in log-scale. The bias input was set to −1 (not adjusted during training) and different initializations of the weights were tested. For the KRR model, we varied the length-scale s (between 0.1 and 10 times the average distance between all training points in log-scale) and the regularization parameter s n (between 10 −5 and 10 −2 in log-scale). For the case of the GPR, we inferred the hyperparameters h = [m, s n , s 1 , . . . , s B ] and model weights using an optimization of the evidence. All experiments were conducted with our SimpleR MATLAB toolbox, freely available at http://isp.uv.es/. The toolbox is intended for practitioners with little expertise in machine learning, and that may want to assess advanced methods in their problems easily. The toolbox compares numerically and statistically the algorithms by simply entering the input-output (e.g. reflectances and LAI values) data matrices.

Results and validation
This section is devoted to show the theoretical performance of the regression methods and experimental evidence of the performance of the proposed processing scheme. We pay attention to the derived HR LAI maps, and illustrate the usefulness for multitemporal rice crop monitoring through a temporal evolution analysis of LAI estimates compared to in situ measurements in the different test sites.

Accuracy assessment over the simulated dataset
It was necessary to build a dedicated model for each of the three remote sensing datasets (Landsat 7 ETM+, Landsat 8 OLI and SPOT5) depending on their spectral bands and angular configurations. We first evaluate the GPR performance compared to NN and KRR (see Fig. 4) over the test dataset (600 samples) corresponding to Landsat 7 ETM+, Landsat 8 OLI and SPOT5 datasets (Landsat 7 ETM+ results were similar to Landsat 8 OLI and are not shown for brevity). Hereafter, we will refer simply to 'Landsat' estimates, irrespectively of being for Landsat 7 ETM+ or Landsat 8 OLI. Likewise, we refer to GPR-Landsat and GPR-SPOT5 to the GPR models built using Landsat and SPOT5 data, respectively. Results revealed good accuracy and low bias in both Landsat and SPOT5 simulated reflectances. GPR outperformed NN and KRR in all statistical quality measures (see Fig. 4). GPR-Landsat and GPR-SPOT5 performances were robust and very similar, revealing biases of 0.02 and 0.03 m 2 /m 2 respectively, while a root mean squared error (RMSE) of 0.78 m 2 /m 2 and high determination coefficients (R 2 > 0.87) were obtained in both cases.
Fig. 5 exhibits the evolution of the average RMSE as a function of the number of predictions, and shows a consistent better performance (i.e. improved robustness to reduced-sized datasets) of GPR over NN and KRR. The curves are the result of averaging a number of realizations, and for each realization, we computed the RMSE with a fixed number of predictions chosen at random. In the limit of 100%, one obtains the results in Fig. 4. This particular plot tries to analyze models's robustness to local consistency and reliability of the estimated error (Montavon et al., 2013). The shape of the obtained curves are similar for all methods (high variance when few predictions were evaluated, and rapid convergence to stable RMSE), yet better for GPR. These results revealed GPR as the most accurate and robust regression method in both Landsat and SPOT5 test sets.
As discussed earlier (cf. Section 3.2), GPR provides a s b parameter whose reciprocal represents the relevance of each band in the regression. Hence, it is possible to identify the most relevant bands used by the GPR model, which it is not affordable when using NN and KRR. Specifically, the most relevant bands in both cases were the green and near infrared bands (see Fig. 6), which showed the theoretical consistency between GPR model behaviors. The identification of the most valuable bands for LAI retrieval was expected since the green and near infrared bands are more related to the greenness of the vegetation (useful for assessing plant vigor) and leaf area index features, Fig. 5. Performance in the test set averaged over 100 random realizations of the Landsat and SPOT5 training-test data splitting as a function of the number of used predictions. respectively. Similar band ranking was obtained by Verrelst et al. (2015b) although the results are not directly comparable because of the different spatial and spectral resolutions.
In this work, all six Landsat optical bands were used for the GPR-Landsat LAI retrievals although in many studies the blue band is not used because atmospheric effects. Nevertheless, blue band was also considered for LAI retrieval in other studies (Atzberger and Richter, 2012;Borel, 2010;Verrelst et al., 2015b) since the blue spectrum holds information valuable for LAI and phenology (Huete et al., 2002). Actually, GPR is very robust to moderate-to-high dimensional spaces (i.e. few more input variables do not impact results negatively). A comparison of LAI estimates obtained with or without using the Landsat blue band revealed a slight improvement when including the blue band in the GPR retrieval (see Fig. S3 and Table S1 of the supplementary material in Appendix A).

Accuracy assessment over ground LAI
With the goal of assessing the accuracy of the retrievals, RMSE between the estimates and the in situ measurements was computed. RMSE values of 0.39, and 0.38 m 2 /m 2 were found in Spain and Italy respectively, showing good accuracy between GPR-Landsat map values and the in situ LAI measurements (see Fig. 7). The GPR-SPOT5 retrievals revealed good accuracies as well, showing RMSE values of 0.51 and 0.47 m 2 /m 2 in Spain and Italy respectively. Other different statistics, such as the mean error (ME), mean absolute error (MAE) and the coefficient of determination were also computed to evaluate the bias, accuracy, and the goodness-of-fit between GPR-Landsat and GPR-SPOT5 predictions and measurements. A remarkably good correspondence between satellite retrievals and in situ measurements was found in the Spanish site, with very low bias for both  GPR-Landsat and GPR-SPOT5 models. A good agreement and low biases were also observed for the Italian site. In all cases, very high correlations were found with R 2 > 0.92. Error bars in Fig. 7 refer to the standard deviation of the field measurements, and are thus related with the heterogeneity of the ESU. In addition, retrievals were also computed using KRR and NN showing slightly less accurate LAI estimates. The interested reader is referred to Fig. S5 of the supplementary material attached in Appendix A.

Comparison of GPR-Landsat and GPR-SPOT5
Besides the aforementioned theoretical performance of the GPR-Landsat and GPR-SPOT5, the experimental consistency between estimates was assessed with the inter-comparison of multitemporal LAI estimates. The closest acquisition dates between Landsat and SPOT5 were taken into account for the comparison. After the SPOT5 resampling to Landsat resolution, the comparison was achieved averaging the LAI value of the valid estimates computed over the 3 × 3 pixels if more than 5 out of the 9 pixels were valid (Morisette et al., 2006) in order to reduce coregistration errors between images and inconsistencies associated to differences in the point spread functions. In general, the estimates provided by both GPRs were highly correlated. However, a slight overestimation of the SPOT5 based retrievals was observed in both study areas (see Fig. 8). In addition, spatial consistency between retrievals was performed for the closest temporal Landsat and SPOT5 images during the rice growing season. The closest useful available images were found in Italy in 2015 July 22nd for Landsat and in 2015 July 21st in the case of SPOT5. Difference (D = GPR SPOT5 − GPR Landsat ) LAI map was computed after the GPR-SPOT5 LAI map resampling into Landsat resolution (30 m). Statistical differences between GPR-SPOT5 and GPR-Landsat from ANOVA (one-way analysis of variance) were computed revealing F-statistic and p-value of 2.860 and 0.001 respectively, highlighting that the two distributions are not statistically different. The obtained results show 77% of the pixels fall within ±0.5 m 2 /m 2 interval and only 1% of the pixels reveal differences higher than ±2 m 2 /m 2 (see Fig. 9 (right)). The prevailing light green color in Fig. 9 (left) demonstrate that the retrievals from the two sensors are coherent and most of the rice areas have  LAI differences around zero. Bigger differences can be partially due to the different spatial resolution between Landsat and SPOT5.

Spatio-temporal analysis
Multitemporal LAI maps were derived for the study areas during the 2015 rice season. Fig. 10 shows the GPR-Landsat and GPR-SPOT5 LAI maps derived for the very beginning of season (mid-May), early growing season (early-June), and maximum leaf rice development (mid-August) in the Spanish site. The corresponding HR estimated LAI maps for the Italian study area are shown in Fig. 11. First inspection of the maps indicates the occurrence of very low LAI values corresponding to mid-May since the sowing dates were around May, 10-15th in Spain. In Italy, for the same period, some rice fields have higher LAI values (>2) because of the early sowing of some rice varieties. In the Spanish site, it can be seen the expected rice emergence in the early-June LAI maps (see Fig. 10 (middle)), while the early-July Italian maps (Fig. 11 (middle)) show already higher LAI estimates due to the advanced phenological growing state corresponding to the rice stem elongation phase. Eventually, the mid-August maps (Figs. 10 and 11 right panels) show the highest LAI estimates because rice plants reached the heading phase and the LAI seasonal peak. All 2015 derived high-resolution maps can be investigated by registering and joining as local user into the ERMES web-based geo-portal (http:// ermes.dlsi.uji.es/prototype/geoportal/) or can be found in the ERMES catalogs (http://get-it.ermes-fp7space.eu/).
The in situ LAI data points allowed us to compare the temporal evolution of field measurements over the study areas (see Fig. 12). In general, LAI estimates derived from the proposed algorithm using GPR-Landsat and GPR-SPOT5 data agree with regard to the seasonal rice phenological cycle in the two countries and followed the temporal dynamics of the ground measurements. The different LAI evolutions that can be observed in Fig. 12, show coherent temporal behaviors as a consequence of either different rice varieties or sowing dates that determine a shift in the development curve. The interested reader is referred to Fig. S6 of the supplementary material attached in Appendix A for further details. It is interesting to note that the difference between Landsat 7 ETM+ and Landsat 8 OLI did not induce large difference in LAI retrieval. For example, LAI estimates of Landsat-8 DOY=190 and Landsat-7 DOY=191 for the Spanish site showed differences lower than 0.25 m 2 /m 2 (see blue profiles in Fig. 12).
On the other hand, both GPR-Landsat and GPR-SPOT5 models also provided a prediction uncertainty for the LAI estimates (see Fig. 12). Although these uncertainties cannot be used as a validation per se, they can be useful to draw conclusions about the quality of the Fig. 11. Estimated LAI maps derived using Landsat (top) and SPOT5 (bottom) images over the Italian study area in mid-May (left), early-July (middle) and mid-August (right). Grey mask covers non-rice areas. retrievals. Apparently, the prediction uncertainties look constant for the multitemporal LAI estimates, nevertheless a deeper look reveals a change in the behavior: at the beginning of the rice season (no vegetation) values are high (≈1.25) and when rice starts to emerge (about day of year 140-150) the uncertainty decreases significantly (≈0.95), while during the rice development period it remains virtually constant. This reflects the fact that the simulations include a large amount of cases with intermediate and high LAI values. It is worth mentioning that the GPR prediction uncertainty only depends on the training reflectance. Thus, if the test and training reflectance are deemed similar, the uncertainty for the test set decreases because we are dealing with similar input features. Therefore, prediction uncertainty must be interpreted as a qualitative variable associated to the estimates: the higher the uncertainty, the lower the confidence on the associated estimate. As a matter of fact, this interpretation comes from the PROSAIL training data set used in the GPRs since unrepresented spectra, such as non-vegetated bodies, lead to higher uncertainties for its associated retrievals. In Section 4.5, we analyze the impact of using different background spectra in PROSAIL on LAI prediction uncertainties.
The rate of LAI development depends on the occurrence of specific phenological phases. Green-up corresponds to the end of the tillering phase, the rapid LAI increase occurs during stem the elongation phase and the plateau of maximum LAI is reached in correspondence of heading for all the flowering period. From the analysis of LAI profiles for different ESUs, especially in Italy, it is possible to appreciate a variability in the occurrence of the mentioned phenological stages due to different sowing dates and cultivated varieties. A deeper intercomparison of LAI time series derived from GPR-SPOT5 in the Italian case study was conducted over selected monitored fields to better exhibit these behaviors. Fig. 13 reports the average time series of LAI for each monitored field in 2015. A buffer of 5 m was considered in order to get rid of the field border in the average computation. In the figure, the different monitored parcels are indicated in numerical order and different varieties in color (see legend), dotted vertical lines represent the date of sowing and the sowing practices as provided by the farmers (i.e. direct sowing in dry soil or seed broadcasting in flood condition in red and blue color respectively). It is possible to appreciate how rice is usually cultivated in monoculture and sowed in dry bare soil or in water (low/zero LAI value at the beginning of the series). It is noteworthy that anomalous high early LAI values in some fields (fields # 68 or # 384) are indicative of a cover crop preceding the rice season (observation confirmed by farmers communication). An early crop establishment period, about DOY 92 (fields # 50 and # 55), determined crop presence (e.g. LAI value) around DOY 130 to 150 when other fields (fields # 61, # 282 and # 384) were just sown. As a consequence of different sowing dates and varieties, the periods of maximum LAI plateau and senescence occur in different moment. Those fields with an early sowing (DOY 90 to 111) reached the LAI peak around DOY 180 whereas the variety sowed around DOY 140 (Sirio CL, Selenio) reached the LAI peak after DOY 200. The interested reader is referred to Fig. S7 in the supplementary material attached in Appendix A for further analysis on LAI temporal behavior according to sowing date and cultivated variety.
The analysis of profiles reveals that LAI estimates are in agreement with the qualitative behavior of rice growth in the different conditions. LAI trends do not show artifacts due to changes in spectral background conditions related to water management. When soil changes from dry and flood condition (or vice versa) in the early growing stages (e.g. for dry sowing field # 106 and # 118; for flood sowing field # 62 and # 102) anomalies in LAI retrievals can be appreciated. The small fluctuations within each temporal profile are mainly due to the residual of atmospheric contamination; for example the SPOT5 image acquired at DOY 142 shows a haze pattern from east to west that can justify the small peak in several plots both in flood (# 102) and dry conditions (# 120). The LAI values of these peaks/drops are anyway within the expected uncertainty of the retrieval.
These results highlight the usefulness of the retrieved LAI time series at high resolution to perform field level crop monitoring and site specific assimilation in crop models.

Assessing the influence of rice background in LAI retrievals
In this study, we used a representative set of background spectra taking into account all possible expected conditions of the field. The influence of background characteristics in LAI retrievals was assessed introducing different types of background spectra in the training database. Firstly, we trained and inverted the PROSAIL model using typical spectra of dry bare soils which were present at the very beginning of the rice season. Secondly, we used a set of flooded soil signatures representing typical background conditions during the rice growing (from sowing to harvest). Eventually, we used both dry and flooded spectra. Note that during these experiments the distributions of PROSAIL parameters for the crop were kept constant, and only the set of background spectra was changed (i.e. dry, flooded and both dry + flooded conditions). Table 3 exhibits the statistics when estimates from Landsat over Spain are compared with in situ LAI data. Similar results were obtained in the case of Italy and SPOT5 data (results not shown for brevity). The best results were obtained when a background spectral library composed by flooded and dry soils was used. Conversely, when considering only flooded or dry backgrounds high errors were found during the initial development stages of the plants (during tillering and before panicle formation), as the reflectance of the background constitutes a significant component of the overall spectral signature recorded by the sensor. Thus, a correct characterization of the soil conditions in the training database is mandatory in order to obtain time series of realistic LAI estimates and avoid meaningless values in the first rice growing stages. Fig. 14 shows a typical example of the SPOT5 LAI evolution retrieved for a rice pixel in a representative Italian and Spanish ESUs. They reflect a common situation in which the soil was dry until DOY 132 (Italian area) and DOY 130 (Spanish area), and flooded immediately before sowing and during plant development. The figure compares the retrievals taking into account both dry and flooded conditions (red line), and the influence of using only dry soils (yellow line) or flooded soils (blue line). It can be observed that the use of a complete and representative spectral database (flooded + dry background) in the PROSAIL simulation provides realistic LAI evolutions (red line) which agree well with field measurements. It should be noted that in this situation LAI remains close to 0 before the rice emergence (from DOY 100 to 150 approx.), irrespectively of the flooded/non-flooded condition of the soil background. In a scenario where the training database does not characterize the flooded conditions (yellow line), the retrieved LAI is unrealistically high after the flooding during the first crop development stages, but as LAI increases canopy closure minimizes the importance of spectral background in the retrieval. The prediction uncertainty reflects this unreliability of the estimates, producing very high values (about 3) in the first crop stages . Similarly, when the training database does not characterize the dry conditions (blue line), the retrieved LAI is also unrealistically high before the real flooding of the fields.
In conclusion, the unrealistic effects produced by wrong spectra background can be very critical when time series analysis of rice crop dynamics is conducted. The increase of LAI at the beginning of the season would determine an unrealistic simulation of crop growth and identification of a wrong sowing date/emergence period respectively. This assessment demonstrates the importance of training correctly the RTM to produce a reliable data set for the retrieval process. In addition, retrieval of prediction uncertainty together with the LAI map can be useful to automatically identify non-agronomical areas and targets for which the model was not calibrated and/or performed well.

Conclusions
In this study, a fully operational chain for deriving high-resolution LAI maps over Mediterranean rice crops is presented. The main novelties of this work include the following : 1) the use of sound machine learning algorithms trained on simulated RTM data specifically generated to characterize rice cropping features, 2) LAI estimates were originally validated with measurements acquired by a smartphone app (at affordable cost both in time, post-processing and human resources), 3) spatially and time-resolved estimations of LAI (not just static estimates) were produced, 4) different areas with particularly different rice crop specificities and spectral background conditions related to water management were analyzed, 5) seasonal remote sensing data from two multispectral sensors were exploited allowing to simulate the future potential of the operational chain with the Sentinel-2 constellation, 6) band ranking for rice monitoring where vegetation and water features make the problem more challenging was studied, and 7) the estimation of prediction uncertainties in a temporal manner were derived from the Gaussian process model and their validity and consistency were assessed. The proposed approach was illustrated in two rice study areas characterized by different rice varieties and agro-practices.
The production of the multitemporal local LAI maps was based on the PROSAIL RTM inversion with Gaussian process regression and Landsat/SPOT5 surface reflectance data. GPR proved to be a highly efficient and accurate method to invert the PROSAIL model, outperforming NN and KRR. However, the flexibility of the processing chain allows its application to any other regression method of choice. Theoretical performances of the regressions were satisfactory, and a deeper analysis of the GPR-Landsat and GPR-SPOT5 training process showed that both models identified the green and near infrared channels as the most relevant bands in the retrieval process.
The multitemporal HR LAI maps captured the range and the temporal evolution of rice growth and resulted in agreement with corresponding in situ LAI measurements obtained from PocketLAI in two different countries. In addition, a comparison of the multitemporal estimates provided by both GPR-Landsat and GPR-SPOT5 retrievals showed good temporal consistency between them. However, a slight overestimation of the GPR-SPOT5 was observed during the period of the rice development.
In the development of the operational chain, the only userdemanding requirement refers to the provision of physical prior knowledge: site-specific characteristics need to be introduced in the model parameterization, such as rice plant and soil characteristics of the interested study area. Results highlight how a correct characterization of the underlying rice background is needed to obtain realistic LAI estimates during the initial development stages of the plants. The use of a training database comprising flooded and dry soil signatures showed to be robust against changes in background condition related to water management.
GPR retrievals also provided associated uncertainty for the predictions as a very valuable side information. This uncertainty is helpful a) for the scientist as a product to understand where potential errors in the retrieval exist, and b) for expert users, such as crop modelers, in order to weight LAI estimates in the assimilation process according to their error/goodness. The prediction uncertainty can also be used for diagnosing the presence of surfaces not addressed in the simulated database. Therefore, inspection of the uncertainty provided with insights to (1) refine the selection of PROSAIL inputs, such as certain backgrounds not initially included in the preliminary retrieval, and (2) identify non-rice pixels thus identifying possible errors in the rice mask.
The near-real time production of HR LAI maps is useful in planning the management practices (i.e. fertilization) to minimize the yield pattern variability within each parcel. This information is particularly important for precision farming activities, since farmers are expecting to be supported in prescription map production for topdress fertilization. A dense temporal data set of LAI maps is also fundamental to perform expert crop monitoring and/or improve crop model estimations exploiting assimilation techniques. It is important to mention that, whenever no external information able to represent the real variability in the field is provided for crop modeling, model simulation will provide the same results. This is the case when the aim is to apply crop models in an operational way at a parcel level. It is in fact not possible to obtain micro-meteorological information able to provide information changing from field to field. Moreover, it is not realistic to have detailed soil maps that usually exist at regional level with a scale ranging from 1:25.000 to 1:100.000. In this case, if the sowing date and variety are the same, or slightly different, the only way to capture the real spatio-temporal changes in crop development and production is to assimilate exogenous observation of crop status such as the information provided by EO LAI maps.
These results demonstrate the consistency and robustness of the presented processing chain for rice monitoring, which can be considered satisfactory for the production of HR LAI maps, and suitable to be assimilated by crop models. The proposal is aimed to improve crop monitoring and is specially suited for precision agriculture applications. Future work will include the use of the Sentinel-2 constellation whose spectral and temporal characteristics will make possible to increase the temporal resolution of the LAI estimates due to the combination with Landsat imagery, leading to further high-resolution multi-sensor studies in the framework of crop monitor and management specially in near real time. The presented processing chain has been operational during the 2016 ERMES rice season.