Deep-Sea Modelling the Mediterranean pelagic ecosystem using the POSEIDON ecological model. Part I: Nutrients and chlorophyll-a dynamics

Ecosystem models are recognized as valuable tools to understand marine ecosystem dynamics and address management questions, requiring an integrated approach of physical and biotic processes. In the present study, a three-dimensional basin scale Mediterranean hydrodynamic/biogeochemical model, currently operational as part of the POSEIDON forecasting system, was extensively validated against available in-situ and satellite data. The generic POSEIDON ecosystem model skill was objectively assessed, in capturing the observed seasonal variability of the main chemical ecosystem components within the planktonic food web. A cluster K-means analysis was applied to obtain an objective ‘biogeography’ of the Mediterranean Sea, allowing a more efficient validation of model outputs against observational data. The model presented a reasonable skill in reproducing the seasonal cycle and observed patterns of the Mediterranean inorganic nutrients and chlorophyll-a, characterized by a prominent west-to-east gradient and their increase in areas receiving lateral nutrient inputs. Despite some model deviations, validation results have demonstrated the capability of the model to represent the Mediterranean ecosystem, ranging from oligotrophic open ocean to mesotrophic conditions in productive coastal areas.


Introduction
Coupled hydrodynamic/biogeochemical models can serve as useful tools to investigate the marine ecosystem functioning, reconstruct properties that are difficult to measure with adequate resolution and describe the marine ecosystem environmental status, under various human and natural pressures. During the last 30 years various ecosystem models allowed to identify and quantify the major spatial-temporal features of the Mediterranean marine ecosystem (Crise et al., 1999;Allen et al., 2002;Crispi et al., 2002;Petihakis et al., 2002Petihakis et al., , 2014Tsiaras et al., 2010,;Lazzari et al., 2012Lazzari et al., , 2016Cossarini et al., 2015;etc.). Moreover, service providers, such as the Marine Environment Monitoring Service of the European Copernicus and the Marine Strategy Framework Directive (MSFD) now routinely utilize ocean datasets provided by models to support decision-making of marine resources use. However, particular attention is required to the validation of ecosystem models, before being used to address research and management questions related to ecosystem dynamics.
In the framework of the POSEIDON forecasting system (http:// www.poseidon.hcmr.gr/; Nittis et al., 2006;Korres et al., 2010), a biogeochemical model coupled to the Mediterranean basin scale hydrodynamic model, is operational since January 2008, providing shortterm prediction of the Mediterranean ecosystem (http://www. poseidon.hcmr.gr/ecological_forecast.php). The generic POSEIDON ecosystem model has been both tuned and validated in the Eastern Mediterranean Sea (Petihakis et al., 2002Triantafyllou et al., 2003;Tsiaras et al., 2014). A pre-operational validation/calibration of the generic POSEIDON ecosystem model was presented in Tsiaras et al. (2010) against remote sensing ocean colour data and available historical in-situ data of chlorophyll-a (Chl-a) and inorganic nutrients (MATER, CINCS, MEDAR/MEDATLAS II database). Recently, Tsiaras et al. (2017a) examined the performance of the model assimilating satellite derived Chl-a with regard to the assimilated data (Chl-a) and non-assimilated model variables, such as dissolved inorganic nutrients (nitrates, phosphates). Given the experience obtained by the widespread use of the generic POSEIDON ecosystem model in the Mediterranean Sea at both research and operational arenas (Petihakis et al., 2009(Petihakis et al., , 2002(Petihakis et al., , 2012Triantafyllou et al., 2000Triantafyllou et al., , 2003Tsiaras et al., 2010Tsiaras et al., , 2014Tsiaras et al., , 2017aTsiaras et al., , 2017bPolitikos et al., 2011;Kalaroni et al., 2016;Hatzonikolakis et al., 2017;Gkanasos et al., 2019), and the growing availability of observational data, is now mature for an in-depth model skill assessment in simulating the main components of the pelagic ecosystem (nutrients, Chl-a, phytoplankton, zooplankton bacteria, dissolved and particulate organic matter).
In this study, the model skill is evaluated by a quantitative comparison of the model results to in-situ observations of inorganic nutrients (phosphate, nitrate) and Chl-a concentrations, obtained from European databases and fixed-point observatories, as well as satellite Chl-a observations. In order to objectively analyse and validate the complex spatial-temporal outputs of the generic POSEIDON ecosystem model, a cluster analysis using the K-means algorithm (Hartigan and Wong, 1979) was carried out to classify the model domain into regions of similarity (clusters). Hence, the model skill to reproduce the basic chemical components of the Mediterranean ecosystem, such as inorganic nutrients and Chl-a, on a seasonal timescale and over an objective eco-regionalization of the basin, was assessed. The modelled biological (plankton) components are presented and validated in detail in a companion paper referred as part ΙΙ in this volume (Kalaroni et al., 2019).

The Mediterranean Sea
The Mediterranean Sea (MS) is the largest (2.5 million km 2 ) and deepest (average 1460 m, maximum 5267 m) quasi-enclosed sea on the Earth, divided by prominent geographic discontinuities into sub-basins with different biophysical and trophic characteristics. MS basin is divided in the western (WMED) and eastern (EMED) sub-basins by the shallow sill of the Sicily Strait. It is one of the most oligotrophic seas (Krom et al., 1991), characterized by an increasing west-to-east oligotrophy gradient (Turley et al., 2000, Bosc et al., 2004. The eastern MS basin is ultra-oligotrophic, with inorganic phosphorus being the most likely limiting nutrient for primary production (Thingstad and Rassoulzadegan, 1995;Thingstad et al., 2005). The oligotrophic regime is partially maintained by an anti-estuarine circulation, in which nutrient-depleted Atlantic Water (AW) flows into the MS through the Gibraltar Strait, while more saline Mediterranean water flows out at intermediate depths (200-500 m) carrying dissolved nutrients of relatively higher concentrations (Pinardi and Masetti, 2000).
Some exceptions of local mesotrophic patterns are observed in a) regions fertilized with nutrients by external water masses (i.e. inflow of AW in Alboran Sea, inflow of BSW in North Aegean Sea), b) in deep convection areas (i.e. Ligurian Sea) where the vertical transport of nutrients enriches the euphotic zone during winter, and c) in coastal areas enriched by large river outflow (i.e. Gulf of Lion, Adriatic and North Aegean Seas; Polimene et al., 2006;Siokou-Frangou et al., 2010;Macias et al., 2014Macias et al., , 2018. However, Chl-a concentration and primary production are very low in the uppermost layers over the entire annual cycle, with the exception of the short period during late winter mixing events, and can only be sustained at the top of the nutricline, where a deep chlorophyll maximum (DCM) is found (Siokou-Frangou et al., 2010). The depth of the DCM progressively increases eastwards, from 30 m in the westernmost area (Dolan et al., 2002), to 50 m in the North West (NW) basin (Estrada et al., 1993), down to 120 m in the Levantine basin (Christaki, 2001;Dolan et al., 2002). In addition, the MS is a region with significant atmospheric inputs of dissolved inorganic nutrients, originating from both Saharian dust and anthropogenic emissions that can stimulate net primary productivity, during summer oligotrophic conditions (Christodoulaki et al., 2013, Gallisai et al., 2014.

Model description
The three dimensional (3-D) hydrodynamic-biogeochemical model consists of two on-line coupled, sub-models: the 3-D hydrodynamic Princeton Ocean Model (POM; Blumberg and Mellor, 1987) and the biogeochemical model based on the European Regional Seas Ecosystem Model (ERSEM; Baretta et al., 1995).

The hydrodynamic model
POM is a primitive equation, free surface and sigma-coordinate 3-D circulation model that has been extensively described in the literature (Oey et al., 1985;Blumberg and Mellor, 1987) and is accompanied by a comprehensive user's guide (Mellor, 2002). It is a widely spread community model that has been used both for coastal and open ocean studies (Korres and Lascaratos, 2003;Zavatarelli and Pinardi, 2003;Kourafalou and Tsiaras, 2007, among others).
The model prognostic variables are temperature, salinity, velocity, sea surface height and turbulent kinetic energy. The model uses a bottom-following sigma coordinate system and an Arakawa-C staggered grid in the horizontal. The vertical eddy viscosity and diffusivity coefficients are computed using the Mellor-Yamada 2.5 turbulence closure scheme (Mellor and Yamada, 1982) that solves the equations for turbulent kinetic energy and turbulence macroscale, considering the wind stirring and the stability induced by stratification. Horizontal diffusion is calculated along sigma-levels following a Smagorinsky formulation (Smagorinsky, 1963). Time integration is performed with a split (internal: 1200 s, external: 1200/90 s) time step, in which the barotropic and baroclinic modes are integrated separately with a leapfrog time differencing scheme. A second order conservative upstream advection scheme (Lin et al., 1994) is used for the biological tracers, while a second order centred advection scheme is used for all other tracers.

The biogeochemical model
ERSEM is a generic comprehensive community model that has been successfully implemented across a wide range of coastal and open ocean ecosystems, such as the North Sea (Radach and Lenhart, 1995), the Mediterranean Sea Petihakis et al., 2002, 2017a and the Arabian Sea , among others.
The model uses a 'functional' group approach, where biotic groups are distinguished not by species but by their functional role (producers, consumers and decomposers) in the ecosystem, using size as the major characteristic. The pelagic food web has been modified by Petihakis et al. (2002) from the original ERSEM, to better represent the microbial loop that is particularly important in the Mediterranean. The pelagic state variables include 4 phytoplankton groups: diatoms (20-200 μm, silicate consumers), nanophytoplankton (2-20 μm), picophytoplankton (< 2 μm) and dinoflagellates (20-200 μm), heterotrophic bacteria and 3 zooplankton groups: heterotrophic nanoflagellates (feeding on bacteria, picophytoplankton and nanophytoplankton), microzooplankton (feeding on nanophytoplankton, heterotrophic nanoflagellates, diatoms and dinoflagellates) and mesozooplankton (feeding on diatoms, dinoflagellates, microzooplankton and heterotrophic nanoflagellates). The pelagic food web model includes also particulate and dissolved organic matter (produced by the mortality, excretion and lysis of primary and secondary producers and utilised by bacteria), along with dissolved inorganic nutrients (nitrate, ammonium, phosphate, silicate). The chemical dynamics of nitrogen, phosphorus, silicate and oxygen are coupled with the biologically driven carbon dynamics, as each group has dynamically varying C:N:P ratios. The uptake of dissolved inorganic nitrogen and phosphorus by phytoplankton is regulated based on the difference between internal and external nutrient pools, following a Droop kinetics formulation (Droop, 1974) to describe nutrient limitation, allowing for luxury uptake. Since there is no internal storage of silicate, the diatom growth is further regulated by a Michaelis-Menten function of the external availability of dissolved silica. The temperature dependence on plankton metabolic rates is described by a Q10 exponential function. Finally, the benthic-pelagic coupling is described by a simple first-order benthic return module that describes the diffusional nutrient fluxes out of the sediment, which are resulted from the remineralization of the deposited accumulated organic matter into the benthos. A different remineralization rate is assumed for the labile and semi-labile fraction of the deposited organic matter.
The parameter set used in the 3-D Mediterranean simulation has been adopted from Petihakis et al. (2002) and Tsiaras et al. (2014Tsiaras et al. ( , 2017b, where a more detailed description of the model can be found. The model parameter values are given in the Supplementary Material (Tables S1-S8). The bacteria sub-model was revised from the one in Petihakis et al. (2002), as described in Petihakis et al. (2014), following Anderson and Williams (1998) and Petihakis et al. (2009), allowing for a more realistic representation of the dissolved organic matter (DOM) pool.
After a 5-year model spin-up with the 3-D on-line coupled model, a long-term simulation was performed over 1990-2009 period, during which in-situ data were mostly available. Given that most of this period was not covered by the POSEIDON atmospheric dataset (POSEIDON-I, 1997& POSEIDON-II 2005, the atmospheric forcing for this long-term simulation was obtained from HIRHAM5 regional climate model hindcast simulation over the same 20-year period (Christensen et al., 2007), driven by ERA-interim global reanalysis (Dee et al., 2011) downscaled to 1/10°resolution. The momentum, heat and freshwater fluxes at the air-sea interface were calculated using hourly fields of wind velocity (10 m), relative humidity (2 m), air temperature (2 m), precipitation, net incoming short-wave radiation and incoming long wave radiation, using a properly tuned bulk formulae set (Korres and Lascaratos, 2003).
The Dardanelles and water exchange regime was parameterized using a two-layer (BSW inflow) open boundary condition with prescribed transport rates and salinity (Nittis et al., 2006). Seasonally mean concentrations were adopted for the inflowing water dissolved inorganic nutrients (Tugrul et al., 2002), while annual mean values are adopted for ammonium, dissolved and particulate organic matter (Polat and Tugrul, 1996). An input of phytoplankton biomass was also applied at the Dardanelles mouth, based on a monthly climatology of available SeaWiFS (Sea-viewing Wide Field-of view Sensor) Chl-a data, estimated using the Ocean Chlorophyll 4-version 4 (OC4-v4) algorithm (O'Reilly et al., 1998).
The model has an open boundary west of the Gibraltar strait. The implemented open boundary conditions are as follows: zero gradient condition for the free surface elevation, a Flather (1976) boundary condition for the integrated (barotropic) velocity and a Sommerfeld radiation condition for the baroclinic velocities. An upstream advection scheme was used for temperature, salinity and biological tracers at the open boundary during outflow, while seasonal climatology is used to obtain temperature/salinity (MODB-MED4 data) and dissolved inorganic nutrients (MEDAR/MEDATLAS II database) during water inflow.
The major Mediterranean rivers (Po, Rhone, Ebro, Nile), along with the most important rivers in the Adriatic, the North Aegean and the Eastern Levantine Seas (25 rivers in total) were parameterised in the model. Mean climatology values, representative of the simulation period and seasonal cycles of (monthly) river water discharge and dissolved inorganic nutrient inputs were adopted from the hydrological/nutrient emission modelling of the Mediterranean drainage basin (Dumont et al., 2005;Ludwig et al., 2009). River inputs of dissolved and particulate organic carbon, silicate and ammonium were based on available data from the literature (Degobbis and Gilmartin, 1990;Skoulikidis et al., 1998;Moutin et al., 1998) and are constant throughout the year in the absence of seasonal cycle data.

Satellite-derived surface chlorophyll concentrations
The satellite remote sensing Chl-a data that were used for the model validation are 8-day global Level-4 SeaWiFS composite products, with 9 km horizontal resolution, processed and provided by the NASA Goddard Space Flight Centre (http://oceancolor.gsfc.nasa.gov) using the OC4-v4 algorithm, for the period 1998-'2007.

In-situ data
In-situ data of seasonal mean inorganic nutrients (i.e. nitrate, phosphate) and Chl-a were obtained from the SeaDataNet (Pan-European infrastructure for ocean and marine data management; http://www.seadatanet.org), and the MEDATLAS/2002 databases for the period 1990-'2015. Data were also collected from the two long-term multi parametric monitoring stations: DYFAMED [DYnamique des Flux Atmosphériques en MEDiterranée; http://www.obs-vlfr.fr/sodyf/ (43.25 N; 7.52 E); Coppola et al., 2016]; and POSEIDON E1-M3A [(35.66 N, 24.99 E), Petihakis et al., 2018], for the same period. In DYFAMED and POSEIDON E1-M3A observatories, nutrients and Chl-a are regularly measured on a monthly basis (fixed point stations, in-situ buoys). To compare with the model outputs, all in-situ data profiles were interpolated at the model depths.

Methodology
The model-simulated near-surface Chl-a was compared against seasonal mean (Winter: January-March, Spring: April-June, Summer: July-September, Autumn: October-December 1998-2007) climatological patterns of Chl-a satellite observations. The overall model skill in reproducing the basin average temporal variability and the annual mean spatial variability of Chl-a was summarized in a Taylor diagram . This graph provides a graphical representation of the correlation coefficient (R) between the model and observation maps (angular position), the model standard deviation (σ) normalized by the observations standard deviation and the unbiased root mean square difference (RMSD) of the errors between model and observation maps. If the model is perfect, the result of the comparison would lie along the X-axis, implying high value for R (close to 1), small RMSD (close to 0) and σ close to 1 [see Jolliff et al. (2009) for further details]. The modelling efficiency index (MEF = 1 -RMSD 2 /σ sat ; Stow et al., 2009) was also computed, providing a measure of the predictive skill of a model relative to an average of the observations. A MEF close to 1 indicates a close match between observations and model predictions. A negative MEF indicates a model that predicts more poorly than the average of observations, while a MEF of 0 reflects a model that predicts the observations as good as their corresponding average. To validate the model outputs, an objective environmental spatial division of the MS to clusters (eco-regions) was applied, using the Kmeans cluster analysis, based on the proposed MS eco-regionalization of D' Ortenzio et al. (2009). A detailed description of the K-means algorithm can be found in Hartigan and Wong (1979). The procedure of the regionalization of the model-simulated data is the following. Using the simulated model output, an annual climatology with 10-day mean time series of surface Chl-a, temperature and phosphate concentrations were generated for each pixel of the basin. The resulting climatology was logtransformed and then standardized, by removing the mean value and dividing with the standard deviation. Prior to the cluster analysis, a principal components analysis (PCA, Chatfield and Collins, 1980) was applied to project the variability of the data into fewer dimensions and remove any correlation between the time series. Using PCA in conjunction with K-means clustering has shown to improve results (Corte-Real et al., 1998;Montazerolghaem et al., 2016). Hence, 6 statistically independent dimensions were derived from the PCA (i.e. principal components -PC), which together explained 97% of the total variance. K-means clustering was then applied to the PC scores with data dimensions of 25,433 × 6 (number of grid points X selected PCs).
The number of clusters was defined before the K-means analysis. Using the R clusterCrit software package (Desgraupes, 2013), 42 validity index tests were applied on the dataset (Milligan and Cooper, 1985), measuring the dispersion within and between clusters. Based on this process, an optimal number of 4 clusters was found. The clusters stability was tested, using the Jaccard coefficient, an index of similarity between datasets (Hennig, 2007), adding an artificial noise or taking a subset of the original dataset (Hennig, 2007;D′ Ortenzio et al., 2009). Only clusters with a Jaccard coefficient equal to or greater than 0.7 were considered stable.

Model validation against satellite Chl-a observations
In this section, model outputs of near-surface (0-10 m) Chl-a mean seasonal climatology are compared against satellite Chl-a observations over the same period (Fig. 2). The generic POSEIDON ecological model successfully reproduced the observed spatial variability, characterized by high Chl-a concentrations in deep convection areas such as the central part of the Liguro-Provençal basin (North-West MS, Lévy et al., 1998) and the Southern Adriatic Sea (Gačić et al., 2002), and in areas receiving important lateral nutrient inputs. These areas include the Gulf of Lion (River Rhone), the North Adriatic (River Po), North Aegean (river and BSW discharge) and Alboran Seas (AW inflow), and the Nile runoff area. A difference, however, between model and observed Chl-a was found in the plume areas of the main rivers (Ebro, Rhone, Po, Nile), where the satellite Chl-a estimation may be affected by the presence of coloured dissolved and detrital organic materials (CDM; Siegel et al., 2013). Moreover, one has to consider that river nutrient loads were obtained from hydrological/nutrient emission modelling of the Mediterranean drainage basin (Dumont et al., 2005;Ludwig et al., 2009). Even though the latter has been validated against available observations, at least for the major Mediterranean rivers, there is still some uncertainty and potential bias in the adopted nutrient loads, particularly for phosphate, which has been obtained from total phosphorus that is usual available in river monitoring (Ludwig et al., 2009). A model deviation from the observed Chl-a was also found throughout the year in the Gulf of Gabes (southern Tunisia) that is probably related with the satellite bio-optical algorithm, overestimating the retrieved Chl-a, due to high bottom reflectance in this shallow water area (Barale et al., 2008). Besides this, the Gulf of Gabes is an area known to be severely impacted by local industries (Kateb et al., 2018) that enrich the marine ecosystem with phosphate, leading to eutrophication events, a nutrient input mechanism which is not parameterized in the model.
A clear seasonal cycle of the MS Chl-a was observed and reproduced by the model with maximum values in winter vertical mixing period and minimum values in summer. During winter in particular, the model reproduced the pronounced phytoplankton bloom in deep convection areas (e.g. NW basin). Such areas are usually characterized by cyclonic circulation that provides a favourable preconditioning for deep convective mixing and nutrient-rich waters upwelling (Casella et al., 2011).
However, a slight underestimation (10-20%) was simulated that is related to an underestimation of the deep winter vertical mixing by the hydrodynamic model (not shown). The model also tends to underestimate winter Chl-a concentrations in the North-East Aegean Sea, which may be attributed to the effect of the BSW discharge and the uncertainty of the adopted inorganic nutrients and dissolved organic matter in the North-East Aegean pelagic ecosystem. Chl-a concentrations start decreasing in spring, as the water stratification begins to establish, preventing nutrient supply from deeper layers. Finally, the model overestimated Chl-a in the eastern MS (by~0.05-0.1 mg m −3 ), particularly in the Levantine basin.
During summer, the establishment of a strong thermocline leaves the Mediterranean basin in highly oligotrophic conditions, with the eastern basin showing the minimum values of Chl-a (< 0.2 mg m −3 ). Areas such as the Gulf of Lion, the Alboran, North Aegean and Adriatic Seas that are affected by lateral nutrient inputs (rivers runoff, BSW, AW), were quite well represented by the model, with a slight underestimation. A new autumn increase of Chl-a concentrations is observed in satellite data that was captured by the model, with maxima in the Alboran and Algerian basins, the Gulf of Lion, the North Adriatic and North Aegean Seas, and minima in the Levantine basin.
The model-simulated interannual variability of the Chl-a concentration Fig. S1 and S2 in the Supplementary Material agrees reasonably well with the observed one (Pearson correlation coefficient R: WMED = 0.87, EMED = 0.75, MS = 0.75, Fig. 3). Although the modelled spring bloom started slightly later during 1999-2002 in the eastern basin, both spring blooms and Chl-a summer decrease were correctly localized in time. Following spring Chl-a peaks, the model generally tends to underestimate phytoplankton blooms in the western basin and overestimate in the eastern. This is mostly related to the hydrodynamic model underestimation/overestimation of the winter mixing intensity that appears to have an impact on the general dynamics of phytoplankton phenology (i.e. bloom or no bloom; Lavigne et al., 2015). However, this was expected as the variability of environmental drivers (i.e. atmospheric forcing, river discharges etc.) in natural systems (i.e. atmospheric forcing, river discharges etc.) is only approximately represented in the model.
The spatial distribution of the observed and model-simulated Chl-a concentrations are well correlated (R = 0.64, Fig. 3), while normalized model standard deviation was less than 1 (STD = 0.52), indicating a weaker spatial model variability, as compared to the observations. A quite similar, relatively large unbiased RMSD was found for the annual mean spatial variability and the basin average temporal evolution (spatial RMSD = 0.78, temporal RMSD = 0.68), while the positive modelling efficiency in terms of Chl-a spatial distribution and temporal evolution, suggests a good agreement with the satellite observations (spatial distribution MEF: 0.39, temporal evolution MEF:0.32).

Vertical distribution of Chl-a
The vertical map (0-120 m) of the average model-simulated Chl-a concentration (Fig. 4) along the longitudinal transect shown in Fig. 1, indicates the zonal gradient in terms of depth (shallower in the west and deeper in the east) of the DCM. In the western basin, the DCM was simulated at an average depth of approximately 50 m, which is in agreement with previous studies (Lavigne et al., 2015). In particular, the average DCM depth from the Alboran Sea to the Straits of Sicily was simulated at 40-60 m and the associated Chl-a concentration ranged from 0.5 to 0.3 mg m −3 . A deepening of the simulated DCM was found from the Straits of Sicily to the Levantine basin, wherein the associated Chl-a concentration ranged approximately from 0.15 to 0.25 mg m −3 . However, the DCM depth was estimated at 60-80 m in the Levantine basin, while field observations have shown that this can reach up to 100-120 m (Christaki et al., 2001;Dolan et al., 2002). Such model deviation is mostly related to the horizontal variability of light attenuation coefficient, being affected by substances, such as CDOM or NAP (Non-algal particles) that are higher in the western basin than in the eastern (Morel and Gentili, 2009   description of light attenuation based on satellite data to obtain a correct simulation of the DCM depth. Herein, it should be stressed out that the model is using also a fixed chlorophyll/carbon (Chl:C) ratio, which can vary significantly in oligotrophic and stratified conditions (Claustre and Marty, 1995). However, sensitivity experiments with the model using a variable Chl:C, based on the Geider photo-acclimation model (Geider et al., 1996), failed to show any deepening of the DCM in the eastern basin, apart from an increase in deep Chl-a concentration.  Table 1 indicates the mean value and standard deviation of modelled (average 0-10 m) and satellite SeaWiFS Chl-a, modelled surface temperature and phosphate in each cluster (eco-region). All clusters have scored high Jaccard coefficients (> 0.78, Table  S9 in Supplementary Material), demonstrating that are well defined and sufficiently stable. In general, the different clusters (eco-regions) were reasonably defined on the basis of the Mediterranean characteristics and particularly on the productivity. The most productive eco-region (Cluster #1, Fig. 5 and S3-S4 in Supplementary Material) encompasses areas with lateral water inputs of relatively low temperature and high nutrient and Chl-a concentrations (i.e. AW/BSW inflows, river discharges), characterized by relatively weak seasonal variability. The second most productive eco-region (Cluster #2, Fig. 5 and S3-S4 in Supplementary Materia)S2 covers most of the western basin, subjected to the nutrient enrichment from AW inflow and deep vertical mixing in the Ligurian-Provencal Sea cyclonic gyre area, characterized by intense winter-spring bloom and relatively low temperatures. It also includes areas not in the immediate vicinity, but influenced by river and BSW nutrient loads (i.e. North Adriatic and North Aegean Seas). Clusters #3 and #4 encompass the most oligotrophic regions of the basin, with the latter one characterized by ultra -oligotrophic conditions, showing the lowest values of Chl-a concentrations (< 0.2 mg m −3 ) due to particularly low phosphate concentration (< 0.04 mmolP/m 3 , Fig. S4) and relatively high temperatures (> 19 C°, Fig. 5 andS3). Accordingly, the modelled mean seasonal climatological Chl-a, nitrate and phosphate were validated against the available in-situ and satellite observations in each eco-region (Figs. 6 and 7 and Fig. S5-S16 in Supplementary Material), considering seasonal medians and inter-quartile ranges, instead of seasonal means due to the non-normal distribution of in-situ data.

Model validation against in-situ observations
Specifically, in the most productive eco-region of the basin (Cluster #1, Fig. 6 top-left, 7a-c, S5-S7) model median values of inorganic nutrients and Chl-a concentrations were found within the observational data range in most cases. However, inorganic nutrients were modelled slightly underestimated, especially for nitrate in the deeper layer. The model-simulated Chl-a also appeared slightly underestimated, particularly in the deeper layer (Fig. S5 in Supplementary Material, average 50-100 m), with simulated median values within the lowest observed range. This Chl-a model underestimation may be partly attributed to the nutrients underestimation, especially nitrate, which is the most limiting nutrient (N:P < 16 mmol N:mmol P, Fig. S17 in Supplementary Material) for the Alboran Sea (Lazzari et al., 2016) and Thermaikos Gulf (N. Aegean Sea, Tsiaras et al., 2014). In addition, the model failed to reproduce the secondary Chl-a peak during autumn (October) that is observed in the satellite data (Fig. 6 top-left). Those model deviations in the simulation of nutrients and subsequently Chl-a are related to the uncertainty of the adopted lateral nutrient inputs (river runoff, AW and BSW inflows) that characterize eco-region #1, in the absence of sufficient monitoring data for a more detailed parameterization of these nutrient inputs.
In the second most productive eco-region #2 (Cluster #2, Fig. 6 topright, 7 d-f and S8-S10), modelled inorganic nutrients and Chl-a distributions in the 0-100 m layer presented an overall good agreement; model medians were mostly within the data variability. Nonetheless, a slight model underestimation was observed for nutrients, particularly nitrates in the lower layer (50-100 m, Figs. S9 & S10)). The median model-simulated Chl-a concentration was in close agreement with data, except a model underestimation in the upper layer during winter period (Fig. 7d) that could be partly attributed to the aforementioned hydrodynamic model underestimation of vertical mixing dynamics. This is also suggested by the smaller range and maximum values of the simulated phytoplankton bloom during March, as compared to the satellite data, with model Chl-a being six times higher than the corresponding summer value, while satellite Chl-a is eight times higher ( Fig. 6 top-right). Despite that, median values of the simulated Chl-a were close to the observed ones, with higher values from January to April.
A better model agreement with the data was found in the oligotrophic eco-region #3 (Cluster #3, Fig. 6 bottom-left, 7g-i and S11-S13), with simulation results capturing the Chl-a and nutrients seasonality. The upper layer nutrients were slightly overestimated during winter, resulting to an overestimation of winter phytoplankton bloom. Again, this is due to the above-mentioned hydrodynamic model overestimation of vertical mixing in the eastern basin, as suggested also by the nutrient higher values, especially for phosphate, in the lower layer. Moreover, the model was initialized with inorganic nutrient data, which was produced with analytical methods characterized by relatively high detection limits (Rimmelin and Moutin, 2005). Given that nutrient concentrations in the EMED were often found below the detection limits of conventional nutrient autoanalyzers (i.e. 0.02 mmolP m −3 for phosphate and 0.4 mmolN m −3 for nitrate; Krom et al., 2005;Lazzari et al., 2016) and that the in-situ dataset compiled for the model validation includes measurements with past and current methods, some overestimation of deep water nutrients is expected. In any case, simulated concentrations are found in the range of observational values, showing a weaker variability. Moreover, one has to take into account that some Mediterranean areas that were clustered in eco-region #3, such as the Ionian Sea and the areas where the Rhodes and South Adriatic cyclonic gyres operate, were characterized by D' Ortenzio et al. (2009) as areas with intermittent blooms, exhibiting strong interannual variability, which is however weaker in the model simulation [i.e. smaller variability of winter Chl-a and nutrients especially in the lower layer (50-100 m)]. Finally, the model captured the ultra-oligotrophy of the eco-region #4 (Cluster #4, Fig. 6 bottom-right, 7j-l and S14-S16), reproducing the observed seasonal cycle of Chl-a, even though simulated median values were slightly higher during winter period. Moreover, the model Chl-a peak shows a slight time-lag (peak during March), as compared to the satellite data (peak during February). Inorganic nutrients seasonal cycle was also well presented by the model, simulating the almost nutrientdepleted conditions in this region. Similar to the eco-region #3, winter overestimated nutrients resulted to a Chl-a bloom overestimation, due to the aforementioned more intense model-simulated winter vertical mixing in the eastern basin. During summer, the mean modelled nearsurface Chl-a were close to the median value, while a large deviation is observed between the median and mean value of satellite data (i.e. mean is almost twice the median value), designating the limitations of the ocean colour bio-optical algorithms in areas characterized by very Table 1 Mean value and standard deviation of modelled (average 0-10 m) and satellite SeaWiFS Chl-a concentration, modelled surface temperature and phosphate in each eco-region (cluster).
Cluster Chl-a (mg m −3 , 0-10 m) Temperature (°C, 0 m) Phosphate (mmol m −3 , 0 m) Model SeaWiFS # 1 0.59 ± 0.65 0.81 ± 0.80 17.59 ± 3.66 0.08 ± 0.09 # 2 0.26 ± 0.12 0.28 ± 0.21 18.24 ± 3.66 0.05 ± 0.02 # 3 0.19 ± 0.10 0.18 ± 0.10 19.11 ± 3.55 0.04 ± 0.01 # 4 0.15 ± 0.08 0.17 ± 0.38 20.9 ± 3.75 0.04 ± 0.01 S. Kalaroni, et al. Deep-Sea Research Part II xxx (xxxx) xxxx low concentrations (< 0.15 mgChl-a m −3 ; Bricaud et al., 2002;Sancak et al., 2005, Fig. 2 in Kalaroni et al., 2016. The upper-layer modelsimulated phosphate concentrations were lower than in-situ data during the water-column stratified period (spring -autumn), which is consistent with the higher Chl-a concentration during the same period. Additionally, the upper-layer phosphate model underestimation could also be attributed to dust atmospheric deposition that might have a considerable impact on the biochemical properties of surface waters in the south-east MS (Christodoulaki et al., 2013); yet it is not included in the model parameterization of the 3-D configuration of the model. In the case of nitrates, the overestimation of the winter entrainment from the deep nutrient pool resulted in some excess nitrogen during summer and since the net primary production is limited by phosphorus availability (N:P > 16 mmol N:mmol P, Fig. S17 in the Supplementary Material) this excess is not consumed.

Model validation against observations at POSEIDON E1-M3A and DYFAMED stations
The model was validated against available in-situ data of phosphate and Chl-a in two contrasting, with regard to productivity and nutrients availability, Mediterranean sites: the DYFAMED (eco-region #2, Ligurian Sea) and POSEIDON E1-M3A (eco-region #3, Cretan Sea) fixed point multiparametric open ocean observatories. Given the limited time coverage of continuous monthly in-situ data profiles at all depths over the same years, a seasonal mean 'climatology' was constructed from the available observational data set. As in section 3.3., we considered seasonal medians and inter-quartile ranges for the comparison of model results with in-situ data.

DYFAMED
The seasonal evolution of the modelled phosphate ( Fig. 8 a-d), nitrate ( Fig. 8 e-h) and Chl-a ( Fig. 8 i-l) concentration presented an overall good agreement with the observed profiles. In the 0-50 m upper layer, the simulated values were within the inter-quartile observational range. Moreover, the model captured the well-established DCM at around 40-50 m during the stratified conditions (spring-summer-autumn period). However, in the lower layer just below DCM (50-100 m), where phytoplankton activity is limited, the simulated inorganic nutrients were underestimated, as compared to the in-situ data, reflecting to the underestimation of the subsurface Chl-a concentration (30-50 m). As mentioned above (Section 3.3), the underestimation of phosphate and nitrate in the euphotic layer is mostly related to the failure of the hydrodynamic model to simulate the winter deep convection phenomena taking place in the Ligurian Sea. However, a comparison with the 1D version of the model (Kalaroni et al., 2016) shows that 3-D modelling has a better performance, especially in the upper layer of the water column (0-50 m), mainly due to the horizontal processes that are missing from the 1-D model.-

POSEIDON EI-M3A
The model reproduced the oligotrophic conditions of the Cretan Sea, with the simulated profiles of inorganic nutrients and Chl-a concentration reasonably close to the observed ones (Fig. 9). Phosphate and nitrate were overestimated during winter and below 60 m (Fig. 9 ah) throughout the year. This is due to a small overestimation of the winter vertical mixing by the hydrodynamic model resulting to a more intense entrainment of deep nutrient-rich water. The simulated Chl-a concentration ( Fig. 9 i-l) appears systematically slightly overestimated in the upper layer (0-60 m), with the exception during the spring period, when modelled values fall inside the inter-quartile range. Another model deviation was found in the simulated DCM that was reproduced at a shallower depth (50-60 m, Fig. 9k), as compared to the observed one (100 m). This may be partly attributed to the overestimated near-surface phytoplankton, resulting in a decrease of subsurface water transparency and therefore a shallower DCM. However, as already mentioned in section 3.2., the simulated DCM depth in the Eastern Mediterranean is generally shallower, as compared to observations, which is mainly related to the light attenuation coefficient parameterization.

Conclusions
In the present study, the Mediterranean marine biogeochemistry and plankton ecosystem dynamics were simulated with the 3-D coupled generic POSEIDON ecosystem model. An extended skill assessment of the model was performed against satellite observations of Chl-a and a collection of in-situ datasets that can be considered as representative of the experimental knowledge of the MS inorganic nutrients (phosphate, nitrate) and Chl-a over the simulation period. A cluster K-means analysis was carried out to provide an a priori biogeographic subdivision of the Mediterranean basin, in order to efficiently validate the highly complex spatial-temporal outputs of the ecosystem model against observational data.
The clustering algorithm applied to model-simulated key biogeochemical variables of the ecosystem (i.e. sea surface temperature, Chl-a and phosphate) provides an insight to the governing mechanisms modulating the system. Thus, the most productive eco-region (Cluster #1) encompasses areas that receive significant lateral nutrient inputs, such as the Alboran Sea (AW inflow), major rivers and BSW discharge areas. Cluster #2, the second most productive eco-region, encompasses the entire western basin, including areas affected by cyclonic water circulation that usually promotes increased vertical mixing (i.e. Gulf of Lion, North-West MS), as well as the North Adriatic and North Aegean Seas that receive large riverine nutrient loads. Finally, clusters #3 (South Adriatic, Ionian and South Aegean Seas) and #4 (South-East MS) encompass the most oligotrophic eco-regions of the Mediterranean basin, characterized by low Chl-a, phosphate and nitrate concentrations and relatively higher temperatures.
The analysis of the 20-year simulation results (1990-2009) points out a number of strengths and weaknesses of the generic POSEIDON ecosystem model. Based on the model validation against in-situ observations for the basic ecosystem state variables (i.e. Chl-a, inorganic nutrients), the model reproduces the spatial gradient and the seasonal variability of the main biogeochemical features of the Mediterranean ecosystem, influenced by the seasonal variability of the thermocline and vertical mixing in the water column. On average, model results are in line with the structure and functioning of the Mediterranean key chemical components that are determined by the large scale west-toeast oligotrophic gradient. A subsurface Chl-a maximum concentration (DCM) is also simulated, close to the top of the nutricline, during the thermal stratification of the water column, although its depth is slightly underestimated in the EMED.
On the other hand, there are certain limitations in the model, which have been identified and analyzed. The model tends to underestimate Chl-a and inorganic nutrients in the more productive areas (eco-regions #1 and #2), due to an underestimation of the nutrient supply in the euphotic zone (river loads, winter vertical mixing). The control of winter mixing on the primary productivity plays an important role, since it is the main mechanism of the euphotic zone nutrient enrichment. Thus, future simulations of the coupled model with a higher resolution atmospheric forcing and/or an improved vertical mixing scheme are expected to give a better representation of the phytoplankton blooming areas. Furthermore, an improvement of the nutrient discharge parameterizations in the MS (river runoff, BSW and AW inflows) would limit the uncertainty in highly productive regions. S. Kalaroni, et al. Deep-Sea Research Part II xxx (xxxx) xxxx (caption on next page) Including atmospheric nutrient deposition in a 1-D version of the model (Christodoulaki et al., 2013) has also shown to affect the biochemical properties of the upper layer of the water column in the eastern basin. The incorporation of the particular mechanism in the current 3-D model setup and its parameterization is work in progress. Although model-simulated nutrients and Chl-a concentration values fall in the range of observational records, our validation results point out the largest variability of the natural system. Given that the variability in observations may reflect also the uncertainty of the experimental methods, further comparisons with more recent observations are clearly necessary, not only to confirm the model results but also for model tuning and calibration. Therefore, the model representativeness error could benefit from a more systematic in-situ monitoring.
The work presented here together with the validation results of model-simulated biological (plankton) components of the MS described in Part II companion paper (Kalaroni et al., 2019), suggest that the generic POSEIDON ecosystem model could be considered to be a good descriptor of the main biogeochemical properties of the pelagic ecosystem and constitutes an important tool to interpolate/extrapolate information, exploring the main biotic and abiotic components of the pelagic ecosystem. The proposed 3-D coupled model forms a valuable tool that can provide important knowledge on the Mediterranean marine ecosystem dynamics in line with the Global Ocean Observing System (GOOS) and its European component (EuroGOOS) themes, ocean health, services and climate change. Ecosystem functioning, the impact of environmental and anthropogenic pressures, the possible effects of climate in the marine environment and many others are all questions which the model can sufficiently address.

Acknowledgments
This work was funded under the EU projects OPEC (Operational Ecology, FP7/2007-2013 grant agreement ID: 283291) and CLAIM (Cleaning Litter by developing and Applying Innovative Methods in european seas, H2020 Grant agreement ID: 774586). We would like to thank the HCMR POSEIDON system team for providing the M3A data set as well as Eleni Dafnomili and Snezana Zivanovic for the analysis of nutrients. We also would like to thank Dionysios Raitsos for providing the SeaWiFS data and Charalampous Ioannis for valuable help in cluster analysis. The DYFAMED time series have been provided by the Oceanological Observatory of Villefranche-sur-Mer (L. Coppola). This project is funded by CNRS-INSU and ALLENVI through the MOOSE observing network.

Appendix A. Supplementary data
Supplementary data to this article can be found online at https:// doi.org/10.1016/j.dsr2.2019.104647.  . 9. Seasonal average profiles of the modelled (red charts) phosphate (a: winter, b: spring, c: summer, d:autumn), nitrate (e: winter, f: spring, g: summer, h: autumn) and Chl-a concentrations (i: winter, j: spring, k: summer, l: autumn) against in-situ seasonal average profiles at the POSEIDON-E1 M3A station. Horizontal lines refer to the first and third quartile (Q1-Q3 quartile) of in-situ data in each depth. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)