Improving water bodies detection from Sentinel-1 in South Africa using drainage and terrain data

In areas with extensive, nomadic, or transhumant livestock farming, regular information on the location of ephemeral surface water bodies is important. Existing near-real-time (NRT) methods for high-resolution surface water mapping are mainly based on optical satellite imagery, which restricts water detection to cloud-free conditions. To overcome this limitation, SAR data are used for water body mapping. Nevertheless, the implemented techniques are usually not fully automated or are not applicable in hilly landscapes, where surface roughness, hill shadows, and the presence of vegetation are known to affect the backscatter and lead to false alarms. In this study, a SAR-based method was used to map surface water from a set of Sentinel-1 images, using the Otsu Valley Emphasis method to automatically detect a threshold for water in the backscatter histogram. To reduce the false alarm rate in steep areas, five water masks using terrain and drainage information with different thresholds are compared in the mountainous province of KwaZulu-Natal (KZN) in South Africa. The quantitative assessment shows that the overall accuracy ranged between 0.865 and 0.958, with the highest value obtained with the HAND (Height Above the Nearest Drainage)-based mask with a threshold of 10m. This mask also minimized the false detection of water, with the lowest false positive rate (0.037). Visual inspection over two reservoirs (Midmar Dam and Wagendrift Dam) shows high agreement between the produced map and the reference data despite differences in their spatial and temporal coverage. In addition, radiometrically terrain-corrected SAR data, which could be advantageous in such landscapes, were recently made available by the ASF Vertex platform. Even though they are not available in NRT, the potential of using such data for water detection is investigated.


INTRODUCTION
Surface water bodies are either permanent or ephemeral, the latter fluctuating in occurrence over time. Monitoring surface water extent to provide reliable and timely information to farmers and livestock producers is crucial when the availability of surface water resources is irregular and influenced by frequent dry periods. According to FAO 1 , drought events account for 85.8% of livestock losses in South Africa. Focusing on the 2015-2016 drought in KwaZulu-Natal, South Africa, Vetter et al. 2 found that livestock mortality was concentrated in a short period after water and forage had become too scarce and too far apart.
Optical and radar satellite data are regularly used to map surface water from space [3][4][5] . Sentinel-1 SAR data have the advantage of providing images every six days over South Africa, from which water can be extracted even under cloudy conditions. Water detection techniques such as classification and machine learning achieve high accuracies after training with reference data 3,4,6,7 . Simpler methods, which separate water from land and vegetation in the histogram of SAR backscatter, have also been applied 8,9 . In mountainous landscapes, water detection methods perform less well because of slope orientation and hill shadows, as well as surface roughness and the presence of vegetation. Therefore, the use of water masks based on elevation, slope, or drainage information has been proposed in the literature 7,8,10,11 . Their efficiency depends on the specific topography of the study area. In addition, SAR Analysis-Ready Data (ARD) are receiving more attention since the CEOS ARD for Land (CARD4L-http://ceos.org/ard/) specifications for SAR data were produced.
In this study, we present the application of an unsupervised method to automatically retrieve water pixels from a series of Sentinel-1 images, and then compare several terrain- and drainage-based masks in order to improve the accuracy of the resulting water maps in South Africa. Additionally, the potential use of Radiometrically Terrain-Corrected (RTC) data, produced on demand as ARD through the ASF Vertex platform, has been explored.

STUDY AREA
The study area is located in KwaZulu-Natal (KZN) province in the east of South Africa (see Figure 1). The KZN province is bordered by the Indian Ocean to the east, Lesotho to the west, Swaziland and Mozambique to the north, and the Eastern Cape province to the south. The province hosts 25% of South African agricultural activities. The sources of water for crop production in KZN range from municipal water supply (3.2% of farms), river (23%), dam (33.1%), water boards/schemes (5.2%), groundwater (14.4%), both surface and groundwater (2.8%), to other headwater and wetlands (29%) 12 . Livestock production constitutes an important economic activity. The livestock systems in use are the grazing system (58.7% of farms), the mixed farming system (25.1%), and the industrial system (2.9%); in 13.3% of farms no livestock system is applied. The study was limited to an area of interest (AOI) defined by the intersection of a series of Sentinel-1 footprints. The AOI is characterized by a complex topography, with elevation ranging from sea level up to 3314m at the Drakensberg escarpment, the highest in South Africa (see Figure 1). The area is part of the Mzimvubu-Tsitsikamma and Pongola-Mtamvuna water management areas, which consist of catchments such as Mbhashe, Mthatha, Mzimvubu, Mtamvuna, Mzimkhulu, Mkomazi, and Mngeni, and main river systems such as the Mbhashe River, Mthatha River, Mzimvubu River, and Mngeni River.

Terrain and drainage data
The Shuttle Radar Topography Mission (SRTM v.4) Digital Elevation Model (DEM) was used at a 90m spatial resolution, as provided by https://srtm.csi.cgiar.org/. A slope layer (in degrees) was derived from the DEM using Geographical Information System (GIS) software. The Height Above the Nearest Drainage (HAND) index, which normalizes the topography with respect to the drainage network 13 , was also used. The dataset was derived through the Google Earth Engine asset 'users/gena/global-hand/hand-100' 14 . All datasets were clipped to the extent of the AOI, resampled to 10m, and reprojected to UTM Zone 34S using GIS software.
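As an illustration of the slope derivation, a slope layer in degrees can be computed from a gridded DEM with finite differences. This is a minimal sketch and not the GIS tool used in the study: the function name `slope_degrees`, the use of `numpy.gradient`, and the synthetic DEM are assumptions, and the DEM is assumed to be in a projected coordinate system with square cells measured in metres.

```python
import numpy as np

def slope_degrees(dem, cell_size):
    """Slope in degrees from a DEM on a regular grid (hypothetical helper).

    dem: 2-D array of elevations in metres.
    cell_size: grid spacing in metres, assumed equal in x and y.
    """
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    dz_dy, dz_dx = np.gradient(np.asarray(dem, dtype=float), cell_size)
    # Slope is the arctangent of the gradient magnitude.
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

# Synthetic DEM: a plane rising 90 m per 90 m cell in the x direction.
dem = np.tile(np.arange(5, dtype=float) * 90.0, (5, 1))
slope = slope_degrees(dem, cell_size=90.0)
```

GIS packages typically use a 3x3 kernel (for example Horn's method) rather than plain central differences, so values on rough terrain may differ slightly from this sketch.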

SAR data
Sentinel-1A and 1B images over the KZN province were downloaded for the period June to August 2018. The dates for which images were available are: 01, 10, 13, 22, and 25 June; 04, 07, 16, 19, 28, and 31 July; and 09, 12, 21, and 24 August. All 15 images were of Ground Range Detected (GRD) type and acquired in Interferometric Wide (IW) swath mode. Only the Vertical-Vertical (VV) polarized bands were pre-processed 9,11 , using the following steps: orbit application, thermal and GRD border noise removal, calibration, terrain correction, speckle filtering, and conversion to the sigma0 backscatter coefficient 11 . The download and pre-processing steps, which use the ESA SNAP Graph Processing Tool (GPT), are described in 9 .
For the same area, a SAR ARD image acquired on 30/06/2017 was downloaded from the ASF Vertex platform (https://www.asf.alaska.edu). The image was produced on demand as a Radiometrically Terrain-Corrected (RTC) product, in which the gamma naught backscatter (gamma0) is provided instead of the sigma0 backscatter coefficient 15 . RTC processing normalizes the radiometric returns by the appropriate surface area, using the SRTM DEM at 30m (1 arc-second). The data are provided either with or without speckle filtering. In the present case, the Enhanced Lee filter with a 7x7 window was used.
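For orientation, the difference between the two coefficients can be written in terms of the radar brightness. The relations below are the commonly used ellipsoid-based definitions together with the area-normalization idea behind terrain flattening; they are given as background and are not taken from the ASF product documentation. The notation is an assumption: $\beta^0$ is the radar brightness, $\theta_E$ the incidence angle relative to the ellipsoid, $A_\beta$ the reference area in the slant-range plane, and $A_\gamma$ the DEM-derived illuminated area projected perpendicular to the look direction.

```latex
\begin{align}
\sigma^0_E &= \beta^0 \sin\theta_E, \\
\gamma^0_E &= \beta^0 \tan\theta_E = \frac{\sigma^0_E}{\cos\theta_E}, \\
\gamma^0_T &= \beta^0 \, \frac{A_\beta}{A_\gamma}.
\end{align}
```

On flat terrain the terrain-flattened $\gamma^0_T$ reduces to $\gamma^0_E$; on slopes the ratio $A_\beta / A_\gamma$ compensates for the change in illuminated area, which is why RTC data are of interest in mountainous areas.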

Reference data
The National Water Body (NWB) layer was available in vector format covering the whole country 5 . The shapefile that delineates the extent of water bodies for the winter period (June to August 2018) was derived from a spatial composite of cloud-free Landsat-8 Operational Land Imager (OLI) images. The polygon layer was dissolved and clipped to the AOI using ArcGIS software. This dataset was used to validate the final water maps in the absence of a higher-resolution reference dataset or ground-truth data. In addition, reference extents of a few dams, based on SPOT6 images, were available in vector format for the year 2017 and were used for the evaluation of the RTC image.

Water detection
To detect the water pixels in the VV SAR backscatter image, the Otsu Valley Emphasis histogram thresholding method was used, as described by Ovakoglou et al. 9 . A threshold is found automatically in the bimodal histogram of each Sentinel-1 image, without the need for reference data or supervision. The algorithm can fail when the distribution is close to unimodal, which results from the prevalence of mixed pixels with varying percentages of land, water, and vegetation; in that case, the threshold found for the most recent preceding image is used. A failure was flagged when the detected threshold exceeded an empirical upper bound of -10dB. All 15 images were combined into a final water map for the full winter period of 2018. The processing of all images was automated using Python scripts and libraries.
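The thresholding and fallback logic described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the 256-bin histogram, and the synthetic backscatter samples are assumptions. The valley-emphasis weight (1 - p_t) is applied here to the between-class variance, a form chosen so that the result does not depend on the sign of the dB values. How a failure on the very first image is handled is not stated in the text; the sketch returns `None` in that case.

```python
import numpy as np

def valley_emphasis_threshold(values_db, bins=256):
    """Threshold (dB) maximising (1 - p_t) * between-class variance."""
    values = np.asarray(values_db, dtype=float).ravel()
    values = values[np.isfinite(values)]
    hist, edges = np.histogram(values, bins=bins)
    p = hist / hist.sum()                      # probability of each bin
    centers = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(p)                          # weight of class at or below bin t
    w1 = 1.0 - w0                              # weight of class above bin t
    m = np.cumsum(p * centers)                 # cumulative first moment
    mt = m[-1]                                 # global mean
    valid = (w0 > 1e-12) & (w1 > 1e-12)
    between = np.zeros_like(p)
    between[valid] = (mt * w0[valid] - m[valid]) ** 2 / (w0[valid] * w1[valid])
    objective = (1.0 - p) * between            # favour sparsely populated bins
    t = int(np.argmax(objective))
    return float(edges[t + 1])                 # upper edge of the selected bin

def thresholds_with_fallback(images_db, upper_bound_db=-10.0):
    """Per-image thresholds; a failed detection reuses the last valid one."""
    thresholds, previous = [], None
    for image in images_db:
        t = valley_emphasis_threshold(image)
        if t > upper_bound_db:                 # failure flag from the text
            t = previous                       # None if no valid threshold yet (assumption)
        else:
            previous = t
        thresholds.append(t)
    return thresholds

# Synthetic backscatter samples (dB): one bimodal scene, one land-only scene.
rng = np.random.default_rng(0)
bimodal = np.concatenate([rng.normal(-22, 1.5, 2000), rng.normal(-8, 2, 18000)])
unimodal = rng.normal(-8, 2, 20000)
ts = thresholds_with_fallback([bimodal, unimodal])
water = bimodal <= ts[0]                       # water where backscatter is below threshold
```

In this toy example the second scene has no water mode, so its threshold lands near the land peak, above -10dB, and the value from the first scene is reused.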

Terrain and drainage masking
In order to improve the water detection by excluding water look-alikes (shadows in steep terrain, snow) and areas where surface water is unlikely to accumulate, water pixels were filtered using five masks. These masks are:
-HAND_10: mask excluding pixels where the HAND index is above a threshold value of 10m. This value was set empirically.
-HAND_15: mask excluding pixels where the HAND index is above a threshold value of 15m 7, 11, 13 .
-Terrain_1: mask excluding pixels where elevation is above 2000m and slope is above 15°. These values were set empirically.
-Terrain_2: mask excluding pixels where elevation is above 1400m and slope is above 15° (this slope threshold was used by Martinis et al. 10 ).
-Terrain_3: mask excluding pixels where elevation is above 1400m and slope is above 8° 8 .
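The five masks can be expressed as boolean rasters, as in the sketch below. The function name `build_masks` and the toy arrays are hypothetical, and the inputs are assumed to be co-registered arrays on the same 10m grid. The terrain masks are implemented literally, excluding a pixel only when both the elevation and the slope condition hold; if the intended rule is to exclude a pixel when either condition holds, `&` would become `|`.

```python
import numpy as np

def build_masks(hand, elevation, slope):
    """Boolean masks that are True where a water detection is kept."""
    hand = np.asarray(hand, dtype=float)
    elevation = np.asarray(elevation, dtype=float)
    slope = np.asarray(slope, dtype=float)
    return {
        "HAND_10": hand <= 10.0,
        "HAND_15": hand <= 15.0,
        # Assumption: a pixel is excluded only if both conditions are met.
        "Terrain_1": ~((elevation > 2000.0) & (slope > 15.0)),
        "Terrain_2": ~((elevation > 1400.0) & (slope > 15.0)),
        "Terrain_3": ~((elevation > 1400.0) & (slope > 8.0)),
    }

# Toy example with three pixels: valley floor, mid-slope, high steep ridge.
hand = np.array([5.0, 12.0, 20.0])        # metres above nearest drainage
elevation = np.array([100.0, 1500.0, 2500.0])
slope = np.array([2.0, 10.0, 20.0])       # degrees
detected = np.array([True, True, True])   # raw water detections

masks = build_masks(hand, elevation, slope)
filtered = {name: detected & keep for name, keep in masks.items()}
```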

Evaluation of RTC images
The Otsu Valley Emphasis algorithm was run on the VV gamma0 layer of the RTC ARD image of 30/06/2017, and the result was compared to the water detection obtained from the VV sigma0 layer of the same image. In the former case, the output data have a spatial resolution of 30m, which is determined by the DEM data used by the ASF algorithm during the radiometric terrain flattening process.

RESULTS
By applying the Otsu Valley Emphasis histogram thresholding method, water thresholds for 10 backscatter images ranged between -25dB and -20.5dB. For five images, the algorithm failed to detect the valley in the VV backscatter histogram and returned threshold values between -1.82dB and 8.7dB, thus confirming the suitability of the upper bound value of -10dB. In these cases, the threshold from the preceding Sentinel-1 image date was used. The results of the threshold detection using the Otsu Valley Emphasis method are shown in Figure 2.

To facilitate comparison with the reference data (NWB layer), all water detections for the whole period were merged using the Union function in a GIS environment. The results of the accuracy assessment are presented in Table 1 for the five masks and for the case where no mask is used. The overall accuracy ranges between 0.866 and 0.958. The highest value was obtained with the HAND_10 mask, which also gave the highest kappa value (0.531). The recall was slightly lower with the HAND_10 mask (0.817) than with all other maps, which had similar recall values of around 0.84. This means that slightly more water pixels were excluded when using HAND_10. However, false positives were fewest, as shown by the highest precision value (0.415) compared to the other masks (HAND_15, Terrain_1, Terrain_2, Terrain_3) or to the unmasked map. False positives should be minimized in the context of livestock management, as erroneous identifications of water bodies could mislead the planning of livestock transhumance. With the Terrain_1 mask, the altitude threshold was set high (2000m), leading to the highest recall value (0.845) but also to a much lower precision (0.179) and overall accuracy (0.882). Examples of surface water detections using the HAND_10 mask for two dam reservoirs of different sizes are presented in Figure 3.
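The merging step and the reported metrics follow the standard definitions for a binary confusion matrix and can be sketched as below. The function names and toy arrays are hypothetical, the water maps and the rasterized reference are assumed to share the same grid, and degenerate cases (for example no predicted water, which would give a division by zero) are not handled.

```python
import numpy as np

def union_of_detections(water_maps):
    """Pixel-wise union of a list of boolean water maps."""
    return np.logical_or.reduce([np.asarray(m, dtype=bool) for m in water_maps])

def binary_metrics(predicted, reference):
    """Overall accuracy, precision, recall and Cohen's kappa for water/non-water."""
    pred = np.asarray(predicted, dtype=bool).ravel()
    ref = np.asarray(reference, dtype=bool).ravel()
    tp = float(np.sum(pred & ref))      # water in both
    fp = float(np.sum(pred & ~ref))     # false alarms
    fn = float(np.sum(~pred & ref))     # missed water
    tn = float(np.sum(~pred & ~ref))    # non-water in both
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return {
        "overall_accuracy": accuracy,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "kappa": (accuracy - expected) / (1.0 - expected),
    }

# Toy example: two single-date maps merged, then compared with a reference.
map_a = np.array([1, 0, 0, 0], dtype=bool)
map_b = np.array([0, 1, 0, 0], dtype=bool)
reference = np.array([1, 0, 1, 0], dtype=bool)

merged = union_of_detections([map_a, map_b])
metrics = binary_metrics(merged, reference)
```

Because water covers a small share of the AOI, overall accuracy is dominated by the non-water class, which is why precision, recall, and kappa are reported alongside it.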
Considering that the maps are seasonal and that the reference dataset itself results from the processing of satellite images, discrepancies with the NWB reference data were expected. Figure 3 illustrates the agreement of the resulting water extent maps with the NWB data, with the entire water bodies clearly mapped in both cases. A slightly wider extent is detected south of the Midmar Dam reservoir (false positives). Several sparse missed detections and false positives are also present around the same dam. The sparse false positives can be explained by the difference in resolution between the compared datasets (10m for the SAR-based water extent and 30m for NWB): the coarser NWB layer might have failed to identify small or ephemeral water bodies.

As for the RTC image, because the selected acquisition predates March 2018, border noise along the edge of the full tile was a concern in the VV layer. More noise was removed from the RTC gamma0 image than from the sigma0 backscatter image. The output resolution of the gamma0 layer is 30m instead of 10m for the non-RTC image, due to the resolution of the DEM used by the ASF algorithm (SRTM 1 arc-second). Some no-data pixels, caused by shadow on steep slopes, are present in the RTC VV backscatter image (shown in white in Figure 4). Figure 4 presents the backscatter for a water reservoir (Nagl dam) located in the center of the image and for a plateau located south-west of it. The no-data pixels appearing in the RTC image are a serious disadvantage, as they could lead to missed water detections 15 .

Figure 5. Water maps obtained using SAR images with and without RTC correction (detected water in white, absence of water in black, in blue the reference water body around the Albert Falls Dam). Left: Water map without RTC correction but with speckle filtering and HAND masking. Right: Water map with RTC correction and speckle filtering (7x7) but no HAND masking.
The histogram thresholding works better on the RTC image, which shows a more distinct valley between the water and non-water pixels and yields a more accurate detection of water in the Albert Falls Dam, as shown in Figure 5, where the reference extent of the dam was delineated from a SPOT6 image. Nevertheless, Figure 5 shows that the detected water body has a similar shape in both datasets, despite the missed detections across the water area. Moreover, because of the 30m spatial resolution of the ASF RTC data, rivers and small water bodies are more difficult to detect than on the non-corrected map, which has a higher resolution.

CONCLUSION
An automated, unsupervised methodology was presented in this paper to detect water from a time series of Sentinel-1 SAR backscatter images in near-real-time using the Otsu Valley Emphasis histogram thresholding method. Five masks based on terrain and drainage information were applied and compared in order to improve the extraction of water bodies in the province of KwaZulu-Natal in South Africa. The results show high agreement with the South African NWB layer used as reference data. The spatial resolution of the water maps was increased from 30m to 10m. The drainage-based mask that retains pixels with HAND values below 10m provided the highest overall accuracy over the whole winter period. Considering the importance of livestock farming in the KZN province, for which information on surface water availability is crucial, it is considered preferable to miss a few water pixels rather than to report false positives; thus, the use of the HAND_10 mask is recommended. The use of radiometrically terrain-corrected SAR data, while leading to a better detection of the water threshold in the histogram, also introduces additional no-data pixels and lowers the resolution of the water maps. Further investigation will be performed with self-produced RTC data rather than ARD images, using DEM data of better quality than those used by ASF during the terrain flattening process.