Identification of Rice Fields in the Lombardy Region of Italy Based on Time Series of Sentinel-1 Data

Probably as a consequence of the imbalance between rice production in Asia and in Europe, satellite-based rice identification in Asia is widely discussed in the scientific literature, whereas SAR-based mapping of European rice paddy fields has received less attention so far. In this paper, we propose a simple methodology for identifying European rice paddy fields from time series of SAR data. Standard water management practices in conventional European rice paddy fields translate into a distinctive pattern of low backscatter values between April and May, typically preceded and followed by higher backscatter values due to ploughing and emergence. Our proposed method leverages this pattern to discriminate rice from other crops and, in a test involving the entire Italian rice-producing region of Lombardy, achieved very good Overall Accuracy (OA) scores. This paper reports the method and our test results, and draws some preliminary conclusions.


INTRODUCTION
Rice mapping based on spaceborne remote sensing has been widely investigated in Southern Asian contexts, where most of global rice production takes place. Cultivation practices in such regions are, however, different from European ones for cultural, environmental and climatic reasons; simple reuse of those methods [1][2][3] will not lead to high levels of accuracy. This situation calls for specific research on space-based mapping of European rice, with a view to extracting information that may be useful, for example, to enhance food traceability by providing third-party (satellite-based) information on development stages, or to cross-check compliance with production specifications. Spaceborne remote sensing data can indeed help monitor crops at a large scale, providing precise and timely information on the phenological status and development of vegetation [4][5][6][7], detecting possible emerging problems, and also building independent records of the crop status. This may pave the way towards cross-checking of organic claims [8].

METHODOLOGY
Standard rice cultivation practices in Europe require flooding of paddy fields between April and May, a feature which is very specific and not generally encountered in other types of crops. Rice fields tend to display low radar backscatter values during this period compared to the rest of the year, due to the water surface covering the area. This peculiar situation can be leveraged to develop a simple yet effective rice identification strategy. As visible in Figure 1, mirror-like reflection from flooded fields does actually translate into significantly lower backscatter intensity in mid-to-late spring, a trend which is not observed in backscatter series from other land cover classes. This suggests that a simple way to discriminate between rice paddy fields and other land cover classes could hinge on the identification of a clear dip in the reflectivity trend at a designated time of year. The experiments reported in this paper are intended to test the feasibility of rice identification based on this feature, and to propose sensible values for classification parameters such as the width, depth and temporal location of the reflectivity dip. Tests were carried out on a statistically significant area, as described in the next section.

Lombardy and rice production
Among all EU countries, Italy is the biggest rice producer, covering more than 53% of the entire European rice-growing area and exporting more than 45% by weight of its domestic output. In North-Western Italy, the province of Pavia in the Lombardy region (Figure 2) accounts for just above one third of the total domestic production [9] thanks to its 82,000 hectares of rice paddy fields. Lombardy was thus selected as the area of interest.

The regional land cover map
Reference data on land cover over the Lombardy region of Italy were extracted from the geographic database named DUSAF 6.0 ("Destinazione d'uso dei suoli agricoli e forestali", 6th version), which refers to year 2018 and was developed by the Lombardy Region using AGEA orthophotos and SPOT 6/7 satellite images. It is publicly available on the web Geoportal of Regione Lombardia [10].
A series of preliminary operations were carried out to adapt the database to the present study. The refined data were then split into two groups: the first one includes data of the administrative Provinces of Pavia, Milan and Lodi (PML), the largest producers of rice in the region; the second group consists of all agricultural fields in the region (AllAgr). This generates two subsets that are somewhat better balanced than the starting set in terms of the ratio between the number of rice (R) fields and of non-rice (NR) fields. Moreover, the two subsets account for 160,000 distinct polygons, which considerably reduced processing times with respect to the full set of 308,000 polygons.

Spaceborne data and processing tools
Cloud cover and haze are very frequent in Northern Italy in spring, when rice paddy flooding takes place. This makes spaceborne radar sensors a more reliable source of information than optical sensors. In terms of processing environment, Google Earth Engine (GEE) was selected because it offers easier access to data, processing tools and resources than the other options considered.
The C-band Sentinel-1 SAR constellation was selected as the radar data source because of its open data policy and because its features are compatible with the intended application. In this work we used the standard IW acquisition mode and the VH polarization, because the latter offers greater sensitivity to the presence of vegetation emerging from the flooded fields. The reference year is 2018, and a time series of 60 radar acquisitions was considered for the analysis. The Sentinel-1 image collection available on GEE has already undergone a series of pre-processing steps leading to the final normalized backscatter coefficient (σ⁰), which is ready to use.

TEST AND TUNING
On each polygon of the reference dataset, the spatially averaged backscatter value was computed for every satellite acquisition, approximately every six days. The series of values was then smoothed using a Savitzky-Golay (S-G) filter, which lets the envelope emerge by suppressing most speckle-driven local jitter while preserving relevant changes in time. This is of critical importance in the decision tree-based identification method proposed in this paper. Multiple tests led us to set a window length of 11 and a filter order of 2 as suitable parameter values for the S-G filter.
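As an illustration of the smoothing step, the following Python sketch applies a Savitzky-Golay filter with window length 11 and polynomial order 2, using the closed-form smoothing weights for a quadratic fit. The function names and the edge handling (repeating the first and last samples) are assumptions made for this example; the text does not state how series boundaries were treated or which implementation was used.

```python
from typing import List, Sequence


def savgol_weights_quadratic(window_length: int) -> List[float]:
    """Closed-form Savitzky-Golay smoothing weights for a quadratic fit.

    For half-width m, the weight at offset i (from -m to m) is
    3 * (3m^2 + 3m - 1 - 5i^2) / ((2m - 1)(2m + 1)(2m + 3)).
    """
    if window_length < 3 or window_length % 2 == 0:
        raise ValueError("window_length must be odd and at least 3")
    m = window_length // 2
    denom = (2 * m - 1) * (2 * m + 1) * (2 * m + 3)
    return [
        3 * (3 * m * m + 3 * m - 1 - 5 * i * i) / denom
        for i in range(-m, m + 1)
    ]


def smooth_series(values: Sequence[float], window_length: int = 11) -> List[float]:
    """Smooth a backscatter series (dB) with an order-2 S-G filter.

    Assumption: the series is padded at both ends by repeating the edge
    values, so the output has the same length as the input.
    """
    if not values:
        return []
    weights = savgol_weights_quadratic(window_length)
    m = window_length // 2
    padded = [values[0]] * m + list(values) + [values[-1]] * m
    return [
        sum(w * padded[k + j] for j, w in enumerate(weights))
        for k in range(len(values))
    ]
```

Calling `smooth_series(series)` on the per-polygon list of spatially averaged backscatter values returns a series of the same length. Away from the edges, any trend that is locally quadratic passes through unchanged, which is why a broad seasonal dip survives while short-lived jitter is attenuated.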
Starting from the smoothed series, we defined our first classification approach, derived from the previous discussion. The date of the absolute minimum value is identified: if this date falls between April 1st and June 30th, the polygon is tagged as "Rice"; otherwise, it is tagged as "Non-Rice". Since S-G filtering removed most of the jitter, this simple criterion was sufficient to achieve a fair overall accuracy. Further research suggested that the average value of the time series calculated over the entire year and the absolute minimum value of the time series could provide additional, actionable information to improve classification results. An in-depth analysis of the rice field reference data has shown that these two parameters feature two well-defined ranges of typical values for rice paddy fields. As can be noted from Figure 4, minima and average values tend to fall in most cases around -22 dB and -18 dB, respectively.

Fig. 3: Example of the implementation of the Savitzky-Golay filter on a time series extracted from a rice field. By suppressing the local jitter, this processing step makes the spring plunge in reflectivity clearly identifiable.
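The first, date-only rule described above can be sketched in Python as follows. The function names are hypothetical, the window bounds are treated as inclusive (an assumption, since the text only says "between"), and the series is assumed to be already smoothed and paired with its acquisition dates.

```python
from datetime import date
from typing import Sequence


def date_of_minimum(dates: Sequence[date], smoothed: Sequence[float]) -> date:
    """Return the acquisition date at which the smoothed series is lowest."""
    if not dates or len(dates) != len(smoothed):
        raise ValueError("dates and smoothed must be non-empty and equally long")
    idx = min(range(len(smoothed)), key=lambda k: smoothed[k])
    return dates[idx]


def minimum_in_flooding_window(
    dates: Sequence[date], smoothed: Sequence[float], year: int = 2018
) -> bool:
    """True if the absolute minimum falls between April 1st and June 30th.

    Assumption: both bounds are inclusive.
    """
    d = date_of_minimum(dates, smoothed)
    return date(year, 4, 1) <= d <= date(year, 6, 30)


def classify_by_date(
    dates: Sequence[date], smoothed: Sequence[float], year: int = 2018
) -> str:
    """First, date-only classifier: tag the polygon as Rice or Non-Rice."""
    return "Rice" if minimum_in_flooding_window(dates, smoothed, year) else "Non-Rice"
```

Because the rule relies on a single global minimum, it is sensitive to any residual outlier elsewhere in the year, which is consistent with the emphasis placed on smoothing before this step.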
Using this information, we could improve our first classifier by imposing two ranges of allowed values, keeping the algorithm simple and reducing the number of false positives at little expense in terms of omission errors. Therefore, the condition on the date of the minimum is no longer sufficient to classify a polygon as "Rice", and additional conditions were sought. Looking at the statistical values for the different classes reported in Table 1, it becomes apparent that typical figures for the minimum and the yearly average are significantly different between rice and other land cover classes. This offered an additional, effective condition for identifying rice: the absolute minimum value must fall between -29 and -17 dB, and the average value must fall between -21 and -15 dB. This improvement proved to be more effective on the PML set, demonstrating that agricultural fields tend to be less separable than urban areas from rice paddy fields with this technique. Another remarkable point from Table 1 is that rice is more homogeneous and consistent than the other classes, since its standard deviations are lower both on minima and on yearly averages.
Table 2 reports the accuracy results measured during the tests described in the previous section, broken down into different areas and different thematic components. The main rows correspond to the different geographic areas. For each main row, two sub-rows report two separate discrimination experiments: rice against all other agricultural land cover classes on the upper half-row, and rice against all other general land cover classes on the lower half-row. It can be noted that accuracy values are always lower when classification is done against other agricultural land cover classes than in the general case. The corresponding κ values are not significant and have not been reported.
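The value-range conditions quoted above, combined with the date condition, can be sketched in Python as follows. The names are hypothetical. The sketch assumes that the ranges are inclusive, that the minimum and the average are taken on the smoothed series, and that the yearly average is the arithmetic mean of the dB values; none of these details is stated explicitly in the text.

```python
from statistics import fmean
from typing import Sequence, Tuple

# Thresholds quoted in the text, in dB.
MIN_RANGE_DB: Tuple[float, float] = (-29.0, -17.0)
MEAN_RANGE_DB: Tuple[float, float] = (-21.0, -15.0)


def values_in_rice_ranges(
    smoothed: Sequence[float],
    min_range: Tuple[float, float] = MIN_RANGE_DB,
    mean_range: Tuple[float, float] = MEAN_RANGE_DB,
) -> bool:
    """Check the conditions on the absolute minimum and the yearly average.

    Assumptions: inclusive bounds, arithmetic mean of dB values.
    """
    if not smoothed:
        raise ValueError("smoothed series must not be empty")
    series_min = min(smoothed)
    series_mean = fmean(smoothed)
    min_ok = min_range[0] <= series_min <= min_range[1]
    mean_ok = mean_range[0] <= series_mean <= mean_range[1]
    return min_ok and mean_ok


def classify_polygon(date_condition_met: bool, smoothed: Sequence[float]) -> str:
    """Tag a polygon as Rice only if the date and value conditions both hold."""
    if date_condition_met and values_in_rice_ranges(smoothed):
        return "Rice"
    return "Non-Rice"
```

Here `date_condition_met` is the outcome of the test on the date of the minimum, passed in as a boolean so that the two stages of the decision tree remain separate and can be tuned independently.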
The polygon population is indeed known to be heavily unbalanced against rice, especially in the entire regional set; the overall accuracy is nonetheless deemed sufficient to guide the development of the algorithm and the tuning of its parameters. As mentioned in the Introduction, little previous research was done on this specific land cover discrimination problem; however, to give the reader a sense of how rice mapping performs in other contexts, the methods proposed in [3] report Overall Accuracy levels falling between 80 and 90% in China.
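To make the role of class imbalance more concrete, the following Python sketch computes the overall accuracy and Cohen's κ from a two-class confusion matrix. The function names are hypothetical, and the counts mentioned in the note after the code are purely illustrative; they are not taken from Table 2 or from any result of this study.

```python
def overall_accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of polygons whose predicted label matches the reference."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


def cohen_kappa(tp: int, fp: int, fn: int, tn: int) -> float:
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    observed = (tp + tn) / total
    # Chance agreement from the marginal totals of predictions and reference.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (total * total)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)
```

For instance, with an illustrative set of 10 rice and 990 non-rice polygons, a classifier that tags every polygon as "Non-Rice" (`tp=0, fp=0, fn=10, tn=990`) reaches an overall accuracy of 0.99 while its κ is 0. Accuracy figures on heavily unbalanced sets should therefore be read together with the class proportions.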

CONCLUSIONS
In this paper, a simple yet effective method to identify rice fields from time series of spaceborne SAR backscatter intensity values has been proposed. The method focuses on rice paddy fields in Europe, where the strong seasonality of the rice growth cycle offers a means to identify them through analysis of a few temporal features. The typical dip in reflectivity associated with field flooding in April is characterized by well-defined ranges of values for a pair of relevant parameters. Specific tests assessed the discriminability of rice paddy fields from the general set of land cover classes, and from other agricultural land cover classes. The proposed rule-based classifier performs well in the general case, less so when comparing rice with other agricultural fields. This may be related to the presence of farmlands that share some similarities with rice fields. Future research lines include investigation of additional shape parameters describing the spring minima, incorporation of polarimetric information where available from Sentinel-1, and possible incorporation of crowdsourced information to build enhanced ground reference data [11].

It can be noted that rice is more easily discerned against the general land cover class set than it is against other agricultural fields.

Fig. 5: Flowchart summarizing the proposed algorithm. The condition on the minimum location requires the absolute minimum to fall between April 1st and June 30th. The condition on the minimum and average values requires the absolute minimum value to fall between -29 and -17 dB, and the average value to fall between -21 and -15 dB.

ACKNOWLEDGMENTS
This work was partly funded by the European Commission through the H2020 MSCA RISE Project EOXPOSURE, Grant Agreement #734541.