Alert System for Algae Bloom Detection in Inland Waters of Latin America: An Ongoing Project

As part of a collaborative effort among researchers from several institutions and organizations, this project uses the Google Earth Engine (GEE) cloud computing environment to map algal blooms over the main water bodies and reservoirs of Latin America using Sentinel-2 imagery (2015 to present). The methodology, based on the Normalized Difference Chlorophyll Index (NDCI) for chlorophyll-a and Trophic State Index (TSI) detection, has provided promising results. NDCI responds well to high levels of chlorophyll-a and can therefore be used as an indicator of algal blooms. The image processing and the display of maps and charts are being implemented in a GEE App that will be freely available to the general public.


INTRODUCTION
Algal blooms (ABs) occur when certain kinds of algae grow very quickly, forming patches, or "blooms", in the water. These blooms can be used as indicators of water degradation, and some of them release powerful toxins that can endanger human and animal health. Moreover, they can cause a foul smell that makes the treatment of public water supplies more costly, and they have a strong impact on sanitation and hygiene [1]-[3]. They also impair the recreational potential of many water bodies, affecting the tourism economy in various ways.
One of the United Nations Sustainable Development Goals (SDGs) is to ensure the quality of water and sanitation for all, and understanding the deleterious impact of AB on water quality is key to achieving it. AB events can be monitored from satellite, and an operational system could send alerts to stakeholders to make them aware of the risk of hazardous species. In this sense, monitoring the trophic state of water bodies is essential, since eutrophication is one of the main pollution problems of inland and coastal waters at a global level [4]. For example, AB maps can be used for applications ranging from urban water supply to fisheries production.
For that reason, a few institutions around the globe are developing tools to monitor the occurrence and extent of AB over time [5]-[8]. This information can help health officials, environmental managers, and water treatment facility operators assure water quality (WQ) for multiple uses.
Most of the effort towards AB monitoring has been made in North America and Europe, while in Latin America (LA) such a tool has not yet been developed. Several factors explain this lack of information for LA, such as incomplete databases for validating water quality products and the limited number of research groups able to pre-process the satellite images needed to generate information on an operational basis. To address this gap, a research project has been approved as an international collaboration (Google Earth Engine and EO Datascience [9]) to build an App that provides near-real-time water quality information. The technical strategy is to use the Normalized Difference Chlorophyll Index (NDCI) [10], derived from atmospherically corrected Sentinel-2 (S2) imagery, as the main parameter enabling spatial and temporal comparison among water bodies from different basins. With the aid of in situ data shared by the project's participants, NDCI will be converted into information such as AB occurrence, Trophic State Index, and chlorophyll-a concentration.

OBJECTIVES
With the collaboration of several institutions and organizations, this project proposes the use of Google Earth Engine (GEE) to map AB over the main water bodies and reservoirs in LA using Sentinel-2 imagery (2015 to present). The main objective is to provide updated information about the state of water quality degradation (eutrophication level) of LA water bodies. The specific objectives are: i. build an atmospherically corrected NDCI collection using cloud computing (GEE); ii. build a GEE App to provide near-real-time WQ information, spatial statistics, graphs, and download options; iii. build capacity among participants and engage stakeholders.

METHODS
We intend to apply the methodology to areas (pilot sites) where in situ data is available for validation purposes. Once the approach is developed and validated in the pilot sites, the method will be transferred to stakeholders at several levels to provide AB information for main water bodies/reservoirs in Latin America.

Image processing
GEE provides a large collection of satellite imagery, including Sentinel-2 (S2). The current atmospherically corrected collection, COPERNICUS/S2_SR, is produced with the Sen2Cor algorithm, which was not designed for water applications and is therefore far from optimal [11]. This raises two issues regarding the application of Sen2Cor products: first, the undercorrection of the red-edge band (B5, 705 nm), which shows consistently higher values than in situ radiometric measurements; and second, the time-series window, which is currently limited to imagery acquired from December 2018 onwards. As an alternative, we propose the use of SIAC (Sensor Invariant Atmospheric Correction) implemented in GEE [12], which has shown satisfactory Level-2 (surface reflectance) results. The atmospherically corrected S2 imagery will be corrected for sun-glint effects and then masked for clouds and land, followed by NDCI computation [10].
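The NDCI computation mentioned above can be illustrated with a minimal per-pixel sketch. It assumes the usual Sentinel-2 band pairing for this index, NDCI = (B5 - B4) / (B5 + B4), with B5 near 705 nm and B4 near 665 nm; the function name and the reflectance values are hypothetical and serve only as an illustration, not as the project's implementation.

```python
def ndci(red_edge, red):
    """Normalized Difference Chlorophyll Index for a single pixel.

    red_edge: surface reflectance near 705 nm (Sentinel-2 B5)
    red:      surface reflectance near 665 nm (Sentinel-2 B4, assumed pairing)
    Returns None when the denominator is zero (for example, a masked pixel).
    """
    denom = red_edge + red
    if denom == 0:
        return None
    return (red_edge - red) / denom


# Hypothetical (B5, B4) reflectance pairs, for illustration only.
pixels = [(0.060, 0.030), (0.020, 0.025), (0.0, 0.0)]
values = [ndci(b5, b4) for b5, b4 in pixels]
```

A positive value indicates that reflectance at the red edge exceeds reflectance in the red band, which is the behavior expected at high chlorophyll-a levels; an overestimated B5, as reported above for Sen2Cor, would bias the index upwards.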
Because the SIAC correction is computationally demanding on GEE, the results cannot be displayed online on the fly. Instead, an NDCI collection will be created in which each image is a 5-day NDCI mosaic of the whole of LA, stored as a GEE asset. From 2015 to 2020, more than 300 mosaics will be stored, totaling approximately 15 GB.
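As a rough check on the number of mosaics, the 5-day compositing windows can be enumerated with a short sketch. The start date of 2015-07-01 is an assumption made only for illustration (the text states just "2015"), and the function name is hypothetical.

```python
from datetime import date, timedelta


def five_day_windows(start, end, step_days=5):
    """Return (window_start, window_end) pairs that cover [start, end)."""
    windows = []
    current = start
    while current < end:
        # The last window is clipped so it never extends past `end`.
        nxt = min(current + timedelta(days=step_days), end)
        windows.append((current, nxt))
        current = nxt
    return windows


# Assumed period: 2015-07-01 (hypothetical start) up to the end of 2020.
windows = five_day_windows(date(2015, 7, 1), date(2021, 1, 1))
n_mosaics = len(windows)
```

Under this assumed start date the count comes out above 300, in line with the figure given above; a later start date or gaps in image availability would reduce it.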

Model Calibration and Validation
To retrieve specific WQ information from the NDCI collection, the first step is to calibrate and validate the models for chl-a, TSI, and AB detection using the in situ database (Figure 1). This process will initially be carried out using a particular database from a water basin or a reservoir. As an illustration, the Tietê River Basin (São Paulo, Brazil) is used here. The in situ data were provided by CETESB, an official environmental agency in the state of São Paulo [13]. This specific application is the subject of a research article in preparation. Using NDCI, a power-law fit estimates chl-a with an R² of 0.78 (N=186), and a decision tree classifies TSI into five classes (Oligotrophic, Mesotrophic, Eutrophic, Supereutrophic, and Hypereutrophic) with an overall accuracy of 75%.
Fig. 1. Schematic illustration of the methodology: image processing followed by calibration/validation of predictive models to display results such as chl-a, NDCI, TSI, and time-series plots.
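The power-law calibration step described above can be sketched as a least-squares fit in log-log space, followed by an R² computation. This is only an illustration under stated assumptions: the exact model form used by the project is not given here, the predictor is assumed to be strictly positive (NDCI itself can be negative, so some transformation would be needed), and the data are synthetic values generated from a known curve rather than CETESB measurements.

```python
import math


def fit_power_law(x, y):
    """Fit y = a * x**b by linear least squares on (log x, log y).

    Requires every x and y to be strictly positive.
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_lx = sum(lx) / n
    mean_ly = sum(ly) / n
    sxx = sum((u - mean_lx) ** 2 for u in lx)
    sxy = sum((u - mean_lx) * (v - mean_ly) for u, v in zip(lx, ly))
    b = sxy / sxx
    a = math.exp(mean_ly - b * mean_lx)
    return a, b


def r_squared(observed, predicted):
    """Coefficient of determination between observed and predicted values."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Synthetic, noise-free data generated from y = 20 * x**1.5 (hypothetical).
x = [0.1, 0.2, 0.4, 0.8]
y = [20.0 * v ** 1.5 for v in x]
a, b = fit_power_law(x, y)
r2 = r_squared(y, [a * v ** b for v in x])
```

With real in situ data the fit would be noisy and R² would fall below 1. Fitting in log space also weights relative rather than absolute errors, so a direct non-linear fit could give somewhat different coefficients.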

Google Earth Engine App (Products)
Once the models are validated, they are implemented as a GEE App so that users who are not familiar with remote sensing can access water quality information derived from satellite imagery. App users can configure options such as location, date range, and products (graphs and spatial statistics) to display and/or download to their computer. The App is intended to reach a wide audience, including water managers, agencies, municipalities, and the general public.

PRELIMINARY RESULTS
The first results for the Tietê River Basin are based on a Sentinel-2 image acquired on February 21st, 2017. After atmospheric correction and masking, NDCI was computed to derive chl-a and TSI maps (Lobo et al. in prep.). The user will be able to choose any date from the image collection. Figure 2 shows the Guarapiranga and Billings reservoirs, located in the Metropolitan Region of São Paulo City, where high NDCI and chlorophyll-a (chl-a) values are observed. As a result, most of the water surface is classified as Eutrophic or higher. Besides displaying RGB, NDCI, and water quality information, the user will be able to generate a time-series plot (Figure 3) from any pixel of the Billings Reservoir. Time-series analysis can help the user identify trends and temporal patterns that are key for proper water quality monitoring. For example, in Figure 3 one can notice higher NDCI values during summertime (December to February in the southern hemisphere) than during winter (June to August). Some of these analytical features are being implemented in the GEE App (Figure 4), including choosing the location, date, and average chl-a concentration for a given period.
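The summer versus winter comparison described above can be reproduced from a pixel time series with a short sketch. The month groupings follow the text (December to February, June to August); the function name and the NDCI values are hypothetical and do not come from the Billings Reservoir series.

```python
from datetime import date

SUMMER_MONTHS = {12, 1, 2}  # austral summer, as defined in the text
WINTER_MONTHS = {6, 7, 8}   # austral winter, as defined in the text


def seasonal_means(series):
    """Mean NDCI per season from a list of (date, ndci) pairs.

    Observations outside the two seasons are ignored; a season with
    no observations maps to None.
    """
    buckets = {"summer": [], "winter": []}
    for day, value in series:
        if day.month in SUMMER_MONTHS:
            buckets["summer"].append(value)
        elif day.month in WINTER_MONTHS:
            buckets["winter"].append(value)
    return {
        season: (sum(vals) / len(vals) if vals else None)
        for season, vals in buckets.items()
    }


# Hypothetical single-pixel NDCI series, for illustration only.
series = [
    (date(2019, 1, 15), 0.30),
    (date(2019, 2, 14), 0.26),
    (date(2019, 4, 10), 0.15),
    (date(2019, 7, 9), 0.05),
    (date(2019, 8, 13), 0.09),
    (date(2019, 12, 20), 0.28),
]
means = seasonal_means(series)
```

Grouping by calendar month pools December with the following January and February only loosely; a multi-year analysis might instead assign each December to the next year's summer.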
Regarding capacity building for the project team, the project's partner (EO DataScience) has already offered three online courses on Google Earth Engine (Fundamentals, Machine Learning, and App) to the project's participants.

CONCLUSIONS & NEXT STEPS
Considering the successful use of NDVI (Normalized Difference Vegetation Index) in several land applications, NDCI is proposed here as a straightforward tool for chlorophyll-a and TSI detection. NDCI responds well to high levels of chlorophyll-a and can therefore be used as an indicator of algal blooms. The image processing and the display of maps and charts are being implemented in a GEE App that will be freely available to the general public.
In parallel with the App development, the next steps include building a Latin American database of water quality information. We expect to gather WQ data from several Latin American countries such as Uruguay, Argentina, Colombia, Costa Rica, and Brazil. With this information, site-specific models can be calibrated and validated. Eventually, a single general model might be applied to retrieve chl-a and TSI, allowing large-scale spatiotemporal analysis in LA. For example, Figure 5 depicts the average NDCI for January 2020 from Brasília (Brazil) to Buenos Aires (Argentina), demonstrating the potential of cloud computing for large-scale analysis.