Evaluation of Multiple Satellite-Based Precipitation Products over Complex Topography

This study evaluates the performance of four satellite-based precipitation (SBP) products over the western Black Sea region of Turkey, a region characterized by complex topography that exerts strong controls on the precipitation regime. The four SBP products include the Tropical Rainfall Measuring Mission (TRMM) Multisatellite Precipitation Analysis version 7 experimental near-real-time product (TMPA-7RT) and post-real-time research-quality product (TMPA-7A), the Climate Prediction Center morphing technique (CMORPH), and the Multisensor Precipitation Estimate (MPE) of the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT). Evaluation is performed at various spatial (point and grid) and temporal (daily, monthly, seasonal, and annual) scales over the period 2007–11. For the grid-scale evaluation, a rain gauge–based gridded precipitation dataset was constructed using a knowledge-based system in which "physiographic descriptors" are incorporated in the precipitation estimation through an optimization framework. The results indicated that the evaluated SBP products generally had difficulty in representing the precipitation gradient normal to the orography. TMPA-7RT, TMPA-7A, and MPE products underestimated precipitation along the windward region and overestimated the precipitation on the leeward region, more significantly during the cold season. The CMORPH product underestimated the precipitation on both windward and leeward regions regardless of the season. Further investigation of the datasets used in the development of these SBP products revealed that, although both infrared (IR) and microwave (MW) datasets contain potential problems, the inability of MW sensors to detect precipitation, especially in the cold season, was the main challenge over this region with complex topography.


Introduction
The accuracy and reliability of any hydrologic study, whether related to flood forecasting, drought monitoring, water resources management, or climate change impact assessment, depend heavily on the availability of good-quality precipitation estimates. Rain gauges provide direct physical measurement of the surface precipitation; however, they are susceptible to certain errors arising from location, spatial scale (point), wind, mechanical errors, and density (Groisman and Legates 1994). Especially in remote parts of the world and in developing countries, ground-based precipitation measurements, such as rain gauge and radar networks, are either sparse or nonexistent, mainly because of the high cost of establishing and maintaining the infrastructure. This situation is further exacerbated in regions with complex topography, where precipitation is characterized by high spatiotemporal variability. In these regions, rain gauges are generally located in lowlands because of accessibility considerations, thus underrepresenting the precipitation occurring in highlands. Satellite-based precipitation (SBP) products are perhaps the only source to fill this important gap.
Recent improvements in SBP retrieval algorithms have enabled representation of the high space-time variability of precipitation fields with quasi-global coverage, thus making them potentially attractive for hydrologic modeling studies in data-sparse regions. Even though SBP measurements are quasi global, high resolution, and easily accessible, these products have certain limitations. SBP algorithms estimate precipitation rate based on one or more remotely sensed characteristics of clouds, such as reflectivity of clouds (visible), cloud-top temperature [infrared (IR)], and scattering effects of raindrops or ice particles [passive microwave (PMW)] (Kidd and Levizzani 2011). Visible and IR sensors are available on geostationary orbiting satellites; therefore, these sensors provide data at fine temporal scales. However, the link between cloud-top temperature and precipitation is indirect and often weak; hence, these sensors provide crude estimates. PMW sensors are available on polar-orbiting satellites, and although these sensors provide accurate estimates of precipitation, their temporal resolution is coarse. More recent algorithms combine measurements from multiple sources, such as IR, PMW, and rain gauges, to take advantage of the strengths of each source and provide more accurate and reliable precipitation estimates (Huffman et al. 2007, 2010; Huffman 2013; Joyce et al. 2004; Aonashi et al. 2009; Sorooshian et al. 2000). Even though SBP estimates contain considerable errors, the ongoing improvements and future planned satellite missions make them potentially useful for hydrologic modeling studies in large basins (Yilmaz et al. 2005; Su et al. 2008; Thiemig et al. 2013).
SBP products are available with quasi-global coverage. However, their performance largely depends on the hydroclimatic characteristics of the region (Yilmaz et al. 2005), and thus, evaluation of these products in different regions will provide the expected error characteristics to the end users and feedback to the algorithm developers. There is an increasing number of studies focusing on the evaluation of the performance of SBP products (Ebert et al. 2007; Sapiano and Arkin 2009; Tian et al. 2007; Kidd et al. 2012). However, studies evaluating the performance of these algorithms over complex topography are still very limited.
The regions characterized by complex topography are among the most challenging environments for SBP estimation because of high spatiotemporal variability of precipitation controlled by the orography. SBP products that utilize information from a combination of IR and PMW sensors are faced with challenges over complex topography. The challenge for IR retrievals is mainly attributed to warm orographic rain, which cannot be detected by the IR retrievals that use cloud-top temperature, hence leading to an underestimation of orographic rains (Dinku et al. 2008) and a failure to capture light-precipitation events. The underestimation by PMW retrievals over mountainous regions is attributed to warm orographic clouds without ice particles that produce heavy rain (Dinku et al. 2010). The overestimation by PMW retrievals over mountains can be related to the classification of cold land surface and ice covers as rain clouds (Dinku et al. 2007; Gebregiorgis and Hossain 2013). Because of this uncertainty associated with the land surface background emissivity, SBP algorithms are sensitized toward estimating liquid precipitation rather than frozen hydrometeors. SBP algorithms are prone to all these errors in mountainous regions and should therefore be evaluated in detail. Despite the importance of SBP products over complex topography, there are only a few studies that focus on evaluation of these products over mountainous regions. Hirpa et al. (2010) found that the Tropical Rainfall Measuring Mission (TRMM) Multisatellite Precipitation Analysis (TMPA) near-real-time product (3B42RT; Huffman et al. 2007) and the Climate Prediction Center (CPC) morphing technique (CMORPH; Joyce et al. 2004) SBP products have similar performances at lower elevations; however, over higher elevations both products suffer from elevation-dependent bias. Dinku et al. (2010) compared CMORPH, TMPA 3B42, and TMPA 3B42RT over two mountainous regions that are characterized by complex topography. They found that both products have low correlations and underestimated the occurrence and amount of precipitation. Another study by Stampoulis and Anagnostou (2012) indicated that over mountainous regions in Europe, both CMORPH and TMPA 3B42V6 products significantly overestimate precipitation in the cold season because of snow/cold surface contamination and that CMORPH shows higher accuracy in winter relative to TMPA 3B42V6. Moreover, they noted that the error variance of the SBP products is seasonally dependent and generally higher over mountains. In summary, performances of SBP products vary significantly over topographically complex regions and are complicated by significant elevation change, seasonality, and snow cover.
The objective of this study is to evaluate multiple SBP products over the western Black Sea region of Turkey, which is characterized by complex topography. The effect of the complex topography on the performance of these products is studied using a rain gauge network and a rain gauge-based gridded dataset interpolated via a procedure considering physiographic controls on precipitation. Four different SBP products are evaluated: TMPA version 7 (Huffman et al. 2007, 2010; Huffman 2013), including the experimental near-real-time monitoring product (TMPA-7RT) and post-real-time research-quality product (TMPA-7A); CMORPH (Joyce et al. 2004); and the Multisensor Precipitation Estimate (MPE; Heinemann et al. 2002) of the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT).
The present study differs from and complements previous studies in several aspects. First, the study area is characterized by a complex topography with significant orographic precipitation and a distinct rain shadow effect. Second, the evaluation is based on a gridded rain gauge dataset constructed using the "physiographic similarity" concept, which is well suited to regions with complex topography. Third, the TMPA products (TMPA-7RT and TMPA-7A) are retrospectively processed with the latest algorithm; hence, the performance of these new products, which have uniform temporal error characteristics, is presented. In addition, the sources of the error in SBP products were further investigated via an analysis of the input data (IR and MW data) utilized in the development of these products.
The paper is organized as follows: the details of the study area and datasets are given in section 2. Evaluation methodology is presented in section 3. The results of the evaluation are presented in section 4, and the summary, conclusions, and recommendations are offered in section 5.

a. Study area
The study area is located in the western Black Sea region of Turkey, covering the area 40°–42.25°N, 30.75°–34.5°E (Fig. 1). The region is characterized by a complex topography, marked by northeast-southwest aligned mountain ranges running parallel to the shoreline. Mountains start immediately after the shoreline (the distance to the shoreline is between 5 and 50 km), with elevations reaching 2500 m. The study area is divided into two regions based on the climatic and topographic characteristics, both of which are predominantly influenced by the mountains; hence, the region boundary closely follows the orographic divide (Fig. 1). Region 1, located to the north of the mountains (windward), receives significant orographic precipitation and is characterized by the midlatitude humid temperate climate, whereas region 2 is located to the south of the mountains (leeward) and is characterized by a dry/subhumid continental climate. The distribution of mean monthly precipitation obtained from rain gauges located in regions 1 and 2 over the period 2007–11 is provided in Fig. 2. It can be seen from Fig. 2 that region 1 receives more precipitation throughout the year, most significantly during winter.

b. Rain gauge dataset
The rain gauge dataset was provided by the Turkish State Meteorological Service (TSMS). TSMS operates two types of meteorological stations in the study region: Automated Weather Observing Systems (AWOS) and pluviometer-type stations (Table 1). Data from the AWOS stations were available at hourly time scales, whereas the pluviometer-type stations report data three times a day. All pluviometer stations are collocated with an AWOS station, thus providing an opportunity for the quality control of the data. In the quality-control step, the consistency between the daily records of the collocated stations was first checked through graphical (double mass curves, time series, and scatterplots) and statistical (such as bias and correlation coefficient) methods. Later, these corrected AWOS stations were used to quality control similar stations nearby by using the correlation weighting method (Westerberg et al. 2010). The daily precipitation data from the quality-controlled AWOS stations were used in this study (Fig. 1).
[…] gauges where feasible, and data are available at 3-hourly, 0.25° × 0.25° latitude-longitude spatial resolution. There are two TMPA products: 1) an experimental real-time monitoring product that is available approximately 9 h after real time and covers the globe between 60°N and 60°S and 2) a post-real-time research-quality product that is available nearly 10–15 days after the end of each month and covers the globe between 50°N and 50°S. The real-time product makes use of TRMM's highest-quality observations, along with high-quality, PMW-based rain estimates from three to seven polar-orbiting satellites and IR estimates from the international constellation of geosynchronous earth orbit satellites, all calibrated by information from TRMM.
The post-real-time research-quality product differs from the experimental real-time monitoring product mainly in two ways: 1) it incorporates monthly rain gauge analysis for bias correction and 2) it uses the TRMM Combined Instrument (TCI) precipitation product for calibration, as opposed to the TRMM Microwave Imager (TMI) used in the experimental real-time monitoring product. TRMM provides coverage from 40°N to 40°S; thus, the present study will highlight the performance of the TMPA products at latitudes higher than the TRMM coverage. The latest version of the TMPA products (version 7; Huffman 2013) was used in this study. In this new version, both TMPA products have been retrospectively processed by algorithm developers with the aim of improving the fine-scale patterns of precipitation during 2000–10 for the post-real-time research-quality product and from 2000 to late 2012 for the experimental real-time monitoring product (Huffman 2013). The TMPA-7A product includes reprocessed datasets used in the previous version (version 6) and has the following upgrades relevant to the current study: additional datasets were incorporated [e.g., Special Sensor Microwave Imager (SSM/I), Microwave Humidity Sounder (MHS), Meteorological Operational satellite program (MetOp), and the 0.07° Gridded Satellite (Gridsat) B1 infrared data]; single, uniformly processed surface precipitation gauge analyses were used, as computed by the Global Precipitation Climatology Centre (GPCC); and latitude-band calibration schemes were used for all satellites. Note that four stations used in this study, namely, ZNG, INB, BOL, and KST, report to the GPCC and hence are already incorporated into the TMPA-7A product (see Table 1 for station names). Input data for both TMPA-7A and TMPA-7RT products were uniformly reprocessed in time for consistency (Huffman 2013).

2) CLIMATE PREDICTION CENTER MORPHING TECHNIQUE
CMORPH estimates precipitation from high-quality passive microwave satellite sensors, which are then propagated by motion vectors derived from more frequent geostationary satellite IR data. Advection vectors of cloud and precipitation systems over the globe are computed from successive IR observations. With the help of these advection vectors, infrequent MW observations are interpolated by "moving" the precipitation systems along the advection vectors in the combined time-space domain. The resulting product is a spatially and temporally complete microwave-derived precipitation analysis that is independent of the infrared temperature field (Joyce et al. 2004). In this study, 3-hourly CMORPH data with 0.25° × 0.25° spatial resolution spanning 60°N–60°S were used.

3) MULTISENSOR PRECIPITATION ESTIMATE
The MPE algorithm estimates near-real-time precipitation rates by blending measurements from SSM/I with brightness temperatures from the IR channel of the Meteosat geostationary satellites (Meteosat-7, Meteosat-8, and Meteosat-9). SSM/I and Meteosat measurements are temporally and spatially coregistered to derive lookup tables (LUTs). LUTs describe the rain rate as a function of the Meteosat IR brightness temperature. The product is generated over the regions up to 60° longitude and latitude from the nominal subsatellite points of three satellites. Since MPE is produced on the assumption that cold clouds produce the most rain, product estimation is most effective for convective precipitation. Moreover, precipitation at warm fronts and orographically induced precipitation is usually detected but could be mislocated by up to 100 km (Heinemann et al. 2002). For this study, the MPE product with a 15-min temporal and 4 km × 4 km spatial resolution is used.

d. Rain gauge-based gridded precipitation dataset
Studies focusing on the evaluation of SBP estimates using rain gauge networks are hampered by the scale differences (grid versus point) between the two products. In an effort to reduce this scale-dependent inconsistency and to have a continuous rain gauge-based precipitation field, a rain gauge-based gridded precipitation (RGP) product has been constructed. The procedure for gridded precipitation estimation is based on the Precipitation-Elevation Regressions on Independent Slopes Model (PRISM; Daly et al. 2002, 2008; Daly 2006). The advantage of the PRISM approach is that it incorporates physiographic descriptors in the precipitation estimation, thus providing a knowledge-based system in which statistical approaches and human expertise are combined in a semiautomated fashion (Daly et al. 2002). Our ultimate goal is to incorporate the influence of the complex topography on the precipitation estimation process. The PRISM approach is specifically developed for regions having low/moderate density of rain gauges and under the influence of significant topographic features, coastal effects, and rain shadows (Daly 2006), such as the study area selected for this study.
PRISM calculates a linear precipitation-elevation relationship for each grid cell, the slope of which changes locally according to the physiographic similarity between the observed and estimated point/grid. A moving-window procedure is used to calculate a unique climate-elevation regression function for each grid cell (Daly et al. 2008):
$$Y = \beta_1 X + \beta_0,$$
where $Y$ is the predicted precipitation, $\beta_1$ is the slope of the regression line, $\beta_0$ is the intercept of the regression line, and $X$ is the elevation at the target cell obtained from the digital elevation model (DEM). The DEM with a 3-arc-s (0.00083°) resolution is obtained from the HydroSHEDS dataset (Lehner et al. 2006) and is further rescaled to 0.05° resolution via nearest neighbor interpolation; this is the grid scale that is used for the PRISM-based precipitation estimation.
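The per-cell regression described above can be illustrated with a small weighted least squares sketch. This is not the PRISM code: the function name `weighted_elevation_regression` and its arguments are hypothetical, and the station weights are assumed to be supplied by the caller (in PRISM they come from the physiographic similarity between each station and the target cell).

```python
import numpy as np

def weighted_elevation_regression(station_elev, station_precip, weights, target_elev):
    """Fit Y = b1 * X + b0 by weighted least squares and predict at target_elev.

    station_elev   : station elevations X (m)
    station_precip : station precipitation Y (mm)
    weights        : non-negative station weights (assumed given; hypothetical input)
    target_elev    : DEM elevation of the target grid cell (m)
    """
    x = np.asarray(station_elev, dtype=float)
    y = np.asarray(station_precip, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                          # normalize weights to sum to unity

    x_mean = np.sum(w * x)                   # weighted mean elevation
    y_mean = np.sum(w * y)                   # weighted mean precipitation
    sxx = np.sum(w * (x - x_mean) ** 2)      # weighted variance of elevation
    sxy = np.sum(w * (x - x_mean) * (y - y_mean))

    b1 = sxy / sxx                           # slope of the regression line
    b0 = y_mean - b1 * x_mean                # intercept of the regression line
    return b1, b0, b1 * target_elev + b0
```

With equal weights this reduces to ordinary least squares; unequal weights let stations that resemble the target cell dominate the local slope.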
In the PRISM approach to precipitation estimation, a locally weighted regression function is constructed for each grid. In the procedure, each station is assigned weights based on the physiographic similarity between the observed and estimated station/grid. The similarity or the combined weight $W$ of a station/grid is a function of the following set of physiographic descriptors:
$$W = \left[F_d W_d^2 + F_z W_z^2\right]^{1/2} W_p W_f W_e,$$
where $F_d$ is the distance weighting importance factor, $F_z$ is the elevation weighting importance factor, $W_d$ is the distance weight, $W_z$ is the elevation weight, $W_p$ is the coastal proximity weight, $W_f$ is the facet weight, and $W_e$ is the effective terrain weight. These descriptors were selected based on the physiographic setting of the study area and the guidelines provided by Daly (2006). Note that all weights and importance factors, individually and combined, are normalized to sum to unity (Daly et al. 2008). Detailed descriptions of these weights and their controlling parameters are provided in the appendix. These descriptors are controlled by a set of parameters. Daly et al. (2002) suggest default values for many of these parameters; however, the values of a few parameters are highly region dependent. In this study the values of these region-dependent parameters ($F_d$, $\Delta z_m$, $\Delta z_x$, $c$, $p_x$, $h_2$, and $h_3$ in Table A1) were selected via an optimization procedure. In the optimization procedure, each rain gauge station is removed, one at a time, from the dataset, and the precipitation value for that station is estimated via the remaining stations using the PRISM approach. The shuffled complex evolution (SCE) algorithm (Duan et al. 1992) is then used to minimize the mean square error between PRISM-estimated and observed monthly precipitation values for all stations. This procedure was used to estimate the seasonal [winter (December–February), spring (March–May), summer (June–August), and autumn (September–November)] PRISM parameters separately for regions 1 and 2.
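The leave-one-out step of the optimization procedure can be sketched as follows. To keep the example self-contained, inverse-distance weighting is swapped in as a stand-in estimator; it is not PRISM, and the names `idw_estimate`, `leave_one_out_mse`, and the `power` parameter are hypothetical. The returned mean square error is the kind of objective that a global optimizer such as SCE would minimize over the estimator's parameters.

```python
import numpy as np

def idw_estimate(xy_known, values_known, xy_target, power=2.0):
    """Stand-in estimator (inverse-distance weighting), NOT the PRISM model."""
    d = np.linalg.norm(xy_known - xy_target, axis=1)
    if np.any(d == 0):                       # target coincides with a station
        return float(values_known[np.argmin(d)])
    w = 1.0 / d ** power
    return float(np.sum(w * values_known) / np.sum(w))

def leave_one_out_mse(xy, values, estimator=idw_estimate, **params):
    """Withhold each station in turn, estimate it from the rest, return the MSE."""
    xy = np.asarray(xy, dtype=float)
    values = np.asarray(values, dtype=float)
    errors = []
    for i in range(len(values)):
        keep = np.arange(len(values)) != i   # mask that drops station i
        est = estimator(xy[keep], values[keep], xy[i], **params)
        errors.append(est - values[i])
    return float(np.mean(np.square(errors)))
```

An optimizer would call `leave_one_out_mse` repeatedly with different parameter values (here only `power`) and keep the set that gives the smallest error.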
A comparison of the performance of the optimized PRISM parameters and the default PRISM parameters indicated that the optimization procedure increased the agreement between the observed and PRISM-estimated precipitation. As mentioned earlier, the TMPA-7A product incorporates four of the GPCC stations located in the study area in precipitation estimation. Hence, to minimize this bias during the evaluation process, these four stations should not be used in the PRISM parameter estimation process. However, two of these GPCC stations (KST and INB) are located in data-sparse regions, and their observations are deemed critical for the reliability of the estimation procedure and are thus included in the PRISM approach. The other two GPCC stations (BOL and ZNG) were not used in the estimation of gridded precipitation via the PRISM approach and were reserved as independent data for evaluation. Table 2 lists the correlation coefficient and mean absolute bias statistics calculated using monthly precipitation observations from independent rain gauges and PRISM-estimated monthly precipitation values using default and optimized parameters. It can be seen that the optimized PRISM parameters provided slightly improved statistics even for these independent rain gauges. The optimized PRISM parameters were then used to interpolate the precipitation values for each 0.05° × 0.05° grid within the study area at the daily time scale, assuming monthly PRISM parameters are also valid for the daily time scale. The PRISM-interpolated grids were further coarsened to 0.25° resolution via a box-averaging technique.
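The coarsening from 0.05° to 0.25° corresponds to averaging non-overlapping 5 × 5 blocks of fine cells. A minimal sketch is given below, assuming the fine grid is a 2D array whose dimensions are exact multiples of the coarsening factor and whose cell edges align with the coarse grid; the function name `box_average` is hypothetical.

```python
import numpy as np

def box_average(fine_grid, factor=5):
    """Coarsen a 2D grid by averaging non-overlapping factor x factor blocks.

    factor=5 corresponds to 0.05 degree cells aggregated to 0.25 degree cells.
    """
    fine = np.asarray(fine_grid, dtype=float)
    rows, cols = fine.shape
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of the factor")
    # Split each axis into (coarse index, position within block), then average
    # over the two within-block axes.
    blocks = fine.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))
```

A plain mean is used here, so any missing value in a block propagates to the coarse cell; whether the original study handled missing fine cells differently is not stated.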

Evaluation methodology
The primary objective of this study was to evaluate the performance of various SBP products over complex topography using a rain gauge network. The evaluation was performed at various spatial scales. First, point-scale precipitation measurements from the rain gauge network were compared with the collocated grid-scale precipitation estimates (0.25° × 0.25°) from SBP algorithms. Second, the RGP product was further utilized in the evaluation of the SBP products.
The 2007–11 period was selected on the basis of data availability. The evaluation was carried out at daily, monthly, seasonal, and annual time scales. In the procedure, the daily time steps marked as "missing" for a single product have been removed from the analysis.
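The removal of missing time steps can be sketched as a simple mask. The example below rests on two assumptions that are not confirmed by the text: missing days are encoded as NaN, and a day that is missing in any one product is dropped for all products so that every comparison uses the same sample. The function name `drop_missing_days` is hypothetical.

```python
import numpy as np

def drop_missing_days(series_by_product):
    """Keep only the days on which every product has a valid (non-NaN) value.

    series_by_product : dict mapping product name -> equal-length daily series
    Returns the filtered series and the boolean mask of retained days.
    """
    names = list(series_by_product)
    stacked = np.vstack([np.asarray(series_by_product[n], dtype=float) for n in names])
    valid = ~np.isnan(stacked).any(axis=0)   # True where no product is missing
    cleaned = {n: stacked[k, valid] for k, n in enumerate(names)}
    return cleaned, valid
```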
The agreements between different precipitation products were investigated using quantitative and categorical statistics as well as graphical tools such as scatterplots. The quantitative statistics include percentage bias (%BIAS), linear correlation coefficient (CORR), and normalized root-mean-squared error (NRMSE). In these statistics, SAT and RG represent the SBP products and the rain gauge-based (point scale or gridded) precipitation estimates, respectively, and i = 1, 2, ..., n indexes the n daily or monthly precipitation data pairs for each grid. The contingency table-based categorical statistics measure the daily rain-detection capability and include false alarm ratio (FAR) and probability of detection (POD). These are based on a 2 × 2 contingency table (a: SAT yes, RG yes; b: SAT yes, RG no; c: SAT no, RG yes; and d: SAT no, RG no) constructed using the RGP dataset and SBP products. The POD [a/(a + c)] gives the fraction of rain events that were correctly detected and ranges from 0 to 1, with 1 being the perfect score. The FAR [b/(a + b)] measures the fraction of rain events that were actually false alarms and ranges from 0 to 1, with 0 being the perfect score.
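The statistics named above can be computed as in the following sketch. The POD and FAR expressions follow the contingency-table definitions given in the text. The forms used for %BIAS and NRMSE are common conventions assumed here (bias relative to the total reference precipitation, RMSE normalized by the mean reference precipitation) and may differ from the exact definitions used in the study. The rain/no-rain `threshold` and all function names are likewise hypothetical.

```python
import numpy as np

def percent_bias(sat, rg):
    """%BIAS: total difference relative to total reference precipitation (assumed form)."""
    sat, rg = np.asarray(sat, dtype=float), np.asarray(rg, dtype=float)
    return float(100.0 * np.sum(sat - rg) / np.sum(rg))

def corr(sat, rg):
    """CORR: Pearson linear correlation coefficient."""
    return float(np.corrcoef(sat, rg)[0, 1])

def nrmse(sat, rg):
    """NRMSE: RMSE divided by the mean reference value (normalization is an assumption)."""
    sat, rg = np.asarray(sat, dtype=float), np.asarray(rg, dtype=float)
    return float(np.sqrt(np.mean((sat - rg) ** 2)) / np.mean(rg))

def pod_far(sat, rg, threshold=0.0):
    """POD = a / (a + c) and FAR = b / (a + b) from the 2 x 2 contingency table."""
    sat_yes = np.asarray(sat, dtype=float) > threshold
    rg_yes = np.asarray(rg, dtype=float) > threshold
    a = np.sum(sat_yes & rg_yes)             # hits
    b = np.sum(sat_yes & ~rg_yes)            # false alarms
    c = np.sum(~sat_yes & rg_yes)            # misses
    pod = float(a / (a + c)) if (a + c) else float("nan")
    far = float(b / (a + b)) if (a + b) else float("nan")
    return pod, far
```

For daily data the categorical scores are the more informative pair, since a product can have a small %BIAS while still missing many events and compensating with false alarms.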

a. Comparison of rain gauge versus satellite-based precipitation
To examine the influence of orography on the performance of the SBP products, cross-section lines were taken along and perpendicular to the mountain ranges (Fig. 1). Figure 3 shows the annual precipitation from rain gauges and collocated SBP grids along cross-section line 1 together with the topographic elevations. Note that cross-section line 1 is perpendicular to the shoreline; station BRT is in region 1 (on the coastal, windward side of the mountains) and the other two stations are located in region 2 (on the drier, leeward side of the mountains). Cross-section lines 2 and 3 are taken along the coastal region (region 1) and inland region (region 2), respectively. For the sake of brevity, we summarize the results from all cross-section lines below using cross-section line 1. The influence of the orography on the precipitation distribution is clearly seen in Fig. 3, with station BRT receiving significantly more mean annual precipitation (850.5 mm) compared to stations KRA (408.1 mm) and CRK (344.2 mm) located inland in region 2. Along the coastal region (region 1), all SBP products underestimate observed precipitation. In this region, TMPA-7A performs better than other products with slight underestimation, possibly because of the monthly rain gauge correction procedure. CMORPH consistently and significantly underestimates the precipitation compared to rain gauges along the coast. MPE, on the other hand, shows underestimation with a wide range of scatter between years. TMPA-7RT also underestimates along the coast, but with less annual bias than CMORPH.
In region 2 (Fig. 3), CMORPH provides consistent annual precipitation estimates compared to rain gauges with slight underestimation. Both TMPA-7A and TMPA-7RT products overestimate the observed precipitation in region 2. The correction procedure employed within the TMPA-7A algorithm resulted in an improved product with less overestimation compared to the TMPA-7RT product.
The maps in Fig. 4 show the 5-yr (2007–11) mean annual precipitation values estimated by each SBP product at 0.25° spatial resolution together with the point-scale observations from the rain gauge network. Note that the region divide is shown by a red line. Starting with the TMPA-7RT product, it can be seen that the precipitation estimates are significantly less along the shoreline and significantly more inland compared to the rain gauges. Underestimation by TMPA-7RT and CMORPH along the shoreline is possibly due to the precipitation detection problems over water-land mixed cells (Huffman et al. 2007). TMPA-7RT produces heterogeneous precipitation estimates marked by a sharp change in precipitation amounts in neighboring cells that may be caused by direct replacement of PMW-calibrated IR estimates with PMW estimates whenever the latter is available. This behavior will be investigated in more detail in section 4c. In the TMPA-7A product, the gauge-based correction procedure seems to work well and improved the precipitation estimates with less significant underestimation along the shoreline and less significant overestimation inland. While it can be seen that the TMPA-7A precipitation estimates decrease going from the shoreline inland, as expected, the precipitation gradient is not as sharp as characterized by the rain gauges and marked by the region divide. CMORPH precipitation estimates are significantly lower and more uniform over the study area compared to other SBP products, thus underestimating the orographic precipitation along the shore more significantly compared to other products. In region 2 the CMORPH precipitation estimates are more consistent with the rain gauge observations compared to other SBP products.
While TMPA and CMORPH products show precipitation patterns with north-south gradients (although with varying magnitudes) over the study region at the mean annual time scale, the spatial pattern of the MPE product differs with a decreasing precipitation trend from east to west. The MPE product is characterized by underestimation in region 1 and overestimation in region 2.
To investigate the performance of the SBP products in a more detailed manner, Fig. 5 shows a comparison of monthly precipitation from two region-representative rain gauges and their collocated SBP grids using scatterplots and quantitative statistics for the cold (September–February, black circles) and warm (March–August, gray triangles) seasons. Note that station AMS is located in region 1 and station DVN is located in region 2. These scatterplots show that CMORPH (more significantly) and MPE products suffer from a precipitation detection problem (points are scattered along the x axis) in region 1 during the cold season. On the other hand, TMPA-7RT resulted in underestimation in region 1 and slight overestimation in region 2 during winter. TMPA-7A resulted in improved precipitation estimates compared to TMPA-7RT in both regions, the only exception being the overestimation in the region 2 winter season. In summary, TMPA-7A outperforms satellite-only SBP products at the monthly time scale, as expected because of the gauge-correction procedure. Satellite-only SBP products generally suffer from a precipitation detection problem in region 1 during the cold season, which is most pronounced for CMORPH and least pronounced for TMPA-7RT. In region 2, all SBP products overestimate monthly precipitation regardless of the season (more significantly during winter), with the only exception being CMORPH.
Comparison of daily precipitation from two region-representative rain gauges and their collocated SBP grids (Fig. 6) indicates deteriorated SBP performance (marked by a wide scatter) compared to the monthly time scale. The points located along the x axis and the y axis show missed and falsely detected daily precipitation events, respectively, which are especially important if these products are to be used in the modeling of floods. The underestimation of precipitation by CMORPH in region 1 (Fig. 6c) is mostly due to the consistently missed daily precipitation events, especially during the cold season. In region 2, however, CMORPH shows improved performance as indicated by the statistics and by the points located closer to the diagonal 1:1 line. Although the MPE product properly detected a few high daily precipitation events, it suffers from significant false detection and missed events in both regions and seasons. The TMPA-7RT product suffers from precipitation detection problems, whereas TMPA-7A showed falsely detected high daily precipitation estimates in region 1 during the cold season. In region 2, both TMPA products overestimate daily precipitation regardless of the season.
[…] sharp precipitation gradient caused by the mountain ranges well, with high precipitation values along the windward side of the mountains and low precipitation values on the leeward side.

b. Comparison of rain gauge-based gridded precipitation versus satellite-based precipitation
The box plot in Fig. 7 shows summary statistics calculated by comparing monthly precipitation estimates from RGP grids and their collocated SBP grids located in regions 1 and 2 during the cold and warm seasons. In these box plots, horizontal lines are the 25th and 75th percentiles and the median of the distribution; vertical lines represent the extent of the rest of the data, which is 1.5 times the 25th–75th percentile range; and outliers are represented by the "+" markers. In general, TMPA-7A, TMPA-7RT, and MPE products underestimate (negative %BIAS) precipitation in region 1 and overestimate (positive %BIAS) precipitation in region 2 regardless of the season. CMORPH results in more than 50% underestimation in both regions in the cold season. In the warm season, CMORPH is characterized by underestimation in region 1 and slight underestimation in region 2. Among satellite-only SBP products, TMPA-7RT shows better CORR with the RGP dataset in both regions and seasons. TMPA-7A performance is superior compared to satellite-only products in terms of CORR and NRMSE statistics, possibly because of the monthly correction procedure.
Focusing on the daily time scale (Fig. 8), it can be seen that the performance of the SBP products diminishes significantly. A general observation is that among all the SBP products, MPE shows the lowest performance in terms of CORR and NRMSE statistics. TMPA-7A shows a general improvement in performance compared to TMPA-7RT. CMORPH produced the highest CORR and lowest NRMSE statistics in region 2 during the cold season, which is possibly due to the surface snow and ice screening process embedded in the algorithm (Joyce et al. 2004; Xie et al. 2007); however, a significant negative %BIAS is evident. Overall, CMORPH shows the best daily statistics in region 2 in both seasons, with the exception of the significant negative bias in the cold season. All SBP products underestimate RGP in region 1 and overestimate RGP in region 2 regardless of the season, with CMORPH being an exception, as it shows underestimation in both regions and seasons. The %BIAS values during the cold season range between −20% (TMPA-7A) and −82% (CMORPH) in region 1 and between +60% (TMPA-7RT) and −64% (CMORPH) in region 2. The %BIAS values during the warm season range between −22% (TMPA-7A) and −54% (CMORPH) in region 1 and between +40% (TMPA-7RT) and −15% (CMORPH) in region 2. Note that the performance of the TMPA-7A, TMPA-7RT, and MPE products over region 2 deteriorates more significantly in the cold season than in the warm season, possibly because of surface snow cover contamination.
Figures 9a and 9b compare the frequency of light precipitation and heavy precipitation reported by RGP and SBP products, respectively. The precipitation threshold values were selected based on an analysis of the daily RGP precipitation distribution in both regions. While a 1–3 mm day⁻¹ interval is a good representation of light precipitation in the study area, a 10–20 mm day⁻¹ interval was deemed a good representation of heavy precipitation events.
Focusing on the light precipitation detection capability, it can be seen that TMPA-7RT underestimates the number of days with light precipitation in region 1 regardless of the season. The correction algorithm employed in TMPA-7A further deteriorates this situation. In region 2, TMPA-7RT slightly underestimates the number of days with light precipitation in both seasons. Again, TMPA-7A further deteriorates this situation in the cold season and yields minor improvements in the warm season. The CMORPH product slightly underestimated the number of days with light precipitation in the cold season regardless of region and produced a light rain frequency similar to that of the RGP dataset in the warm season. The number of days with light rain reported by the MPE product is consistent with the RGP dataset regardless of region and season.
Focusing on the heavy precipitation events (Fig. 9b), it can be seen that the RGP dataset reported a significantly greater number of days with heavy precipitation in region 1 compared to region 2 in the cold season, whereas SBP products fail to discriminate this behavior. CMORPH reported significantly fewer days with heavy precipitation compared to the RGP dataset and other SBP products regardless of the region and season. The TMPA products significantly overestimated the number of days with heavy precipitation in region 2 and underestimated the number of days with heavy precipitation in region 1 regardless of the season. However, TMPA products showed the best performance among SBP products in detecting heavy precipitation in region 1. In general, the MPE product reported a number of days with heavy precipitation similar to that of the RGP dataset in region 2, whereas it underestimated the number of days with heavy precipitation in region 1.
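The frequency comparison behind Fig. 9 amounts to counting the days whose precipitation falls inside each intensity interval. The sketch below is only an illustration: whether the interval endpoints are inclusive is an assumption, the use of None for missing days is a convention chosen here, and the daily values are hypothetical.

```python
def count_days_in_interval(daily_mm, low, high):
    """Count days whose precipitation lies in the closed interval [low, high].

    Days stored as None (missing) are skipped. Treating both endpoints as
    inclusive is an assumption made for this sketch.
    """
    return sum(1 for p in daily_mm if p is not None and low <= p <= high)


LIGHT = (1.0, 3.0)    # mm/day, light precipitation interval given in the text
HEAVY = (10.0, 20.0)  # mm/day, heavy precipitation interval given in the text

# Hypothetical daily series (mm/day) for one grid cell.
rgp_daily = [0.0, 1.5, 2.9, 12.0, None, 25.0, 10.0, 0.4]
sbp_daily = [0.0, 0.0, 3.5, 18.0, 2.0, 11.0, 0.0, 0.5]

light_days = {
    "RGP": count_days_in_interval(rgp_daily, *LIGHT),
    "SBP": count_days_in_interval(sbp_daily, *LIGHT),
}
heavy_days = {
    "RGP": count_days_in_interval(rgp_daily, *HEAVY),
    "SBP": count_days_in_interval(sbp_daily, *HEAVY),
}
```

Comparing the two dictionaries product by product gives the kind of under- or overestimation of light and heavy precipitation days discussed above.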
It can be seen that the CMORPH product suffers from heavy precipitation detection problems in region 1, especially in the cold season. This behavior can be partly attributed to the morphing algorithm; heavy precipitation events occurring in between infrequent PMW scans will likely be missed.
Figure 10 shows the seasonal variation of categorical performance measures for precipitation magnitudes greater than 1 mm day⁻¹ and for those greater than 9 mm day⁻¹. Note that the heavy precipitation threshold (9 mm day⁻¹) is selected to ensure that the categorical measures for each grid combination are calculated based on at least four samples. It should be noted that POD and FAR are complementary measures and hence should be considered together to understand the performance trade-off between correctly detected observed precipitation and falsely estimated precipitation. In terms of precipitation detection, it can be seen from Figs. 10a and 10b that CMORPH has the lowest POD performance (less than 0.3) among the SBP products in the cold season, indicating precipitation detection problems in both regions. Because of this detection problem, CMORPH provided the best (lowest) FAR performance in the cold season in regions 1 and 2. MPE precipitation estimates are characterized by low POD values varying between 0.30 and 0.45 in regions 1 and 2, respectively, accompanied by the worst (highest) FAR values among the compared SBP products. TMPA-7RT showed the best (highest) POD performance in both seasons and regions (0.4 in region 1 and 0.6 in region 2), accompanied by poor (high) FAR values (0.35) in region 2 and moderate FAR values (0.18) in region 1. Therefore, TMPA-7RT can be characterized by a detection problem (low POD and low FAR) in region 1 and an overestimation problem (high POD and high FAR) in region 2. The correction procedure included in TMPA-7A deteriorated the POD performance while slightly improving the FAR performance compared to the TMPA-7RT product.
Focusing on the warm season, an improvement in POD and FAR performance of all SBP products is evident compared to the cold season for both regions. In the warm season, the MPE product is characterized by low POD and FAR performance compared to other SBP products. In summary, the considered SBP products are characterized by varying degrees of precipitation detection problems in region 1, which are more significant in the cold season, and they showed improved precipitation detection performance in region 2, especially in the warm season.
In terms of heavy precipitation (Figs. 10c,d), both CMORPH and MPE products show significant deterioration in POD performance in the cold season. MPE further showed significant deterioration in FAR performance (FAR > 0.75) in both regions in the cold season. Evaluated SBP products are characterized by poor POD performance specifically in region 1, and they are characterized by poor FAR performance specifically in region 2 regardless of the season. TMPA products perform better than other SBP products in terms of detecting heavy precipitation; TMPA-7A outperforms the others.
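The categorical measures discussed above can be sketched from a simple contingency count. The definitions used here are the common ones and are assumed rather than quoted from the paper: POD is hits divided by hits plus misses, and FAR is false alarms divided by hits plus false alarms. The function name and the daily values are hypothetical; an event is taken to be a day strictly above the threshold, following the "greater than" wording in the text.

```python
def pod_far(observed, estimated, threshold):
    """Return (POD, FAR) for events defined as precipitation > threshold.

    Assumed definitions:
      POD = hits / (hits + misses)
      FAR = false_alarms / (hits + false_alarms)
    A measure is NaN when its denominator is zero.
    """
    hits = misses = false_alarms = 0
    for obs, est in zip(observed, estimated):
        obs_event = obs > threshold
        est_event = est > threshold
        if obs_event and est_event:
            hits += 1
        elif obs_event:
            misses += 1
        elif est_event:
            false_alarms += 1
    pod = hits / (hits + misses) if (hits + misses) else float("nan")
    far = false_alarms / (hits + false_alarms) if (hits + false_alarms) else float("nan")
    return pod, far


# Hypothetical daily precipitation (mm/day) at one collocated grid pair.
rgp_daily = [0.0, 2.0, 5.0, 0.5, 12.0, 0.0, 3.0, 9.5]
sbp_daily = [1.5, 0.0, 4.0, 0.0, 15.0, 0.0, 0.2, 10.0]

pod_1mm, far_1mm = pod_far(rgp_daily, sbp_daily, threshold=1.0)
pod_9mm, far_9mm = pod_far(rgp_daily, sbp_daily, threshold=9.0)
```

Reading the two values together matters: a product that rarely reports precipitation can score a low FAR simply because it issues few detections, which is the trade-off noted for CMORPH in the cold season.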

c. Comparison of 3B40RT, 3B41RT, and MWCOMB
Since the datasets utilized in the development of these SBP products are different and have undergone various quality-control procedures, caution is needed while evaluating the performance of these products. For example, the TMPA-7RT algorithm produces rain rates by combining information from both MW and IR retrievals via a calibration procedure, whereas CMORPH produces rain rates solely from MW retrievals and propagates these temporally sparse retrievals via temporally rich IR retrievals. To investigate potential sources of errors in these algorithms, we further analyzed the IR and MW datasets used in developing these products. The TMPA-7RT product combines MW-only and IR-based precipitation estimates, named 3B40RT and 3B41RT, respectively. The 3B40RT is a merged microwave [TRMM Microwave Imager (TMI), Advanced Microwave Scanning Radiometer for Earth Observing System (AMSR-E), SSM/I, Special Sensor Microwave Imager/Sounder (SSM/IS), Advanced Microwave Sounding Unit-B (AMSU-B), and MHS] precipitation estimate averaged at 0.25° × 0.25° spatial and 3-hourly temporal resolution. The 3B41RT is an IR-based precipitation estimate that converts 0.25° × 0.25° averaged IR brightness temperature to precipitation rates via a local space-time calibration procedure incorporating high-quality MW data. The CMORPH product derives precipitation estimates at 0.25° × 0.25° spatial and 3-hourly temporal resolution by merging various microwave retrievals (TMI, SSM/I, and AMSU-B), which are then propagated in space by cloud motion vectors derived from IR images. This merged microwave product is called MWCOMB. In this comparison, the daily time steps marked as ''missing'' in any single dataset have been removed from the analysis. Note that time steps used in this analysis are different from those used in section 4b; hence, a comparison of the statistics between these two sections is not appropriate. Figure 11 shows summary statistics calculated by comparing daily precipitation estimates from RGP grids and their collocated 3B41RT, 3B40RT, and MWCOMB grids.
AUGUST 2014 DERIN AND YILMAZ
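The screening of missing time steps mentioned for this comparison can be sketched as follows. This is one reading of the text, namely that a day is dropped from every dataset when any one dataset marks it as missing. The function name, the use of None as the missing marker, and the sample values are all hypothetical.

```python
def drop_steps_missing_anywhere(series_by_name):
    """Keep only the time steps at which every dataset has a value.

    `series_by_name` maps a dataset name to a list of daily values, with
    None marking a missing time step (a convention assumed for this sketch).
    """
    lengths = {len(values) for values in series_by_name.values()}
    if len(lengths) != 1:
        raise ValueError("all series must cover the same time steps")
    n_steps = lengths.pop()
    keep = [
        i for i in range(n_steps)
        if all(values[i] is not None for values in series_by_name.values())
    ]
    return {name: [values[i] for i in keep] for name, values in series_by_name.items()}


# Hypothetical daily values (mm/day) for four time steps.
datasets = {
    "RGP": [1.0, 0.0, 3.2, 5.0],
    "3B40RT": [0.5, None, 2.0, 4.0],
    "3B41RT": [1.2, 0.1, None, 6.0],
    "MWCOMB": [0.0, 0.0, 1.0, 3.5],
}
common = drop_steps_missing_anywhere(datasets)
```

Because the retained days depend on which datasets enter the comparison, statistics computed on this common subset are not directly comparable with those from a different set of products.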
It can be seen that the MW datasets used in TMPA and CMORPH products (3B40RT and MWCOMB, respectively) perform very similarly in the cold season in both regions, as indicated by CORR values less than 0.15, NRMSE values ranging between 1.5 and 2, and %BIAS values ranging between −75% and −100%. The IR dataset used in TMPA-7RT (3B41RT) shows CORR and NRMSE performance similar to that of the MW datasets in region 1 during the cold season, with improved (less negative) %BIAS performance. In region 2, however, 3B41RT shows higher performance in terms of CORR and %BIAS (also opposite sign with overestimation) and lower performance in terms of NRMSE (due to false alarms) as compared to MW datasets in the cold season. In the warm season, the MW dataset used by CMORPH performs slightly better than that used by TMPA in terms of CORR and %BIAS; however, it performs slightly poorer in terms of NRMSE. It can be concluded that the performance of the MW and IR datasets is similar in the warm season; moreover, the performance of the IR dataset is generally better than the MW dataset in the cold season in region 2. The uniform nature of underestimation by CMORPH in region 2 can be attributed to the MW dataset used in the algorithm, and the spatially heterogeneous nature of the TMPA-7RT product in region 2 can be explained by the differences in precipitation rates produced by the IR and MW datasets used in this algorithm.
FIG. 8. As in Fig. 7, but for daily statistical results.

Summary and conclusions
Satellite-based precipitation (SBP) retrieval algorithms with quasi-global coverage are becoming increasingly attractive for hydrologic studies in regions with sparse ground-based networks and complex topography. The objective of this study was to evaluate the performance of multiple SBP products over the western Black Sea region of Turkey, a challenging region characterized by complex topography that exerts strong controls on the precipitation regime; for example, the precipitation values reduce by half on the leeward side of the orography compared to the windward side (within 50 km) because of the rain shadow effect. Four SBP products were evaluated, namely, TMPA-7A, TMPA-7RT, CMORPH, and MPE, at various spatial (point and grid) and temporal (daily, monthly, seasonal, annual) scales over the 2007–11 period. The evaluation is based on rain gauge observations (point scale) and a rain gauge-based gridded precipitation (RGP) dataset constructed using a knowledge-based system in which ''physiographic descriptors'' known to control precipitation are incorporated in the precipitation estimation through an optimization framework. It should be noted that the TMPA-7A algorithm utilizes a rain gauge correction algorithm whereas other products contain satellite-only information; hence, the former product already has an inherent advantage in the evaluation compared to other SBP products.
Focusing on the annual time scale, all evaluated SBP products underestimated precipitation along the coast (windward region) at various levels compared to the rain gauge dataset. Among the satellite-only SBP products, CMORPH showed the most significant underestimation, whereas MPE was characterized by wide scatter in precipitation estimates across the years. The TMPA-7A product showed better performance than other SBP products, possibly due to the rain gauge correction procedure. Along the drier (leeward) side of the orography, SBP products were generally characterized by overestimation of precipitation. CMORPH, being the only exception, provided slight underestimation of annual precipitation compared to the rain gauge dataset while producing the best performance at the annual scale on the leeward side of the orography. Investigation of the spatial distribution of the precipitation estimates over the study area showed that evaluated SBP products failed to represent the sharp precipitation gradient normal to the orography (rain shadow effect) revealed by the RGP dataset. In addition, the TMPA-7RT and CMORPH products suffered from precipitation detection problems over water-land mixed cells along the shore (Huffman et al. 2007), and the TMPA-7A algorithm was able to correct for this problem. TMPA products also showed more heterogeneity in the spatial distribution of precipitation, possibly due to the approach used in these algorithms for merging MW and IR-based estimates (Huffman et al. 2007; Dinku et al. 2007).
The seasonal dependency of the performance of the SBP products was investigated at the monthly time scale. It was found that underestimation (overestimation) of RGP precipitation by SBP products on the windward (leeward) side is generally characteristic for both the warm and cold seasons, but is more pronounced for the cold season. CMORPH, being an exception, underestimated precipitation on both the windward and leeward sides regardless of the season.
Analysis at the daily time scale revealed that the CMORPH product, and partly the MPE product, suffered from daily precipitation detection problems specifically in the cold season and the windward region. One of the possible reasons for this behavior of the CMORPH product could be the surface snow and ice screening procedure embedded in the algorithm (Joyce et al. 2004), because MW sensors largely fail to discriminate between frozen hydrometeors and surface snow and ice (Dinku et al. 2007; Gebregiorgis and Hossain 2013). Another possible reason applicable to all investigated SBP products for both the warm and cold seasons is the limitations in spatial and temporal sampling by the MW sensors; the large footprint of MW sensors is unlikely to detect small-scale precipitation events, and the long time interval in between MW scans will likely result in missed precipitation. Since CMORPH estimates precipitation rates using MW-only sensors, these situations will likely be exacerbated. Underestimation by SBP products on the windward side is possibly due to warm orographic precipitation that cannot be detected by any passive (MW or IR) sensor (Scofield and Kuligowski 2003; Petty and Krajewski 1996; Dinku et al. 2007, 2010). Overestimation by TMPA and MPE products on the leeward (drier) side could be attributed to a number of limitations of MW and IR sensors. Overestimation by MW sensors is likely due to surface snow/ice contamination (Stampoulis and Anagnostou 2012) in the cold season.
Overestimation by IR sensors is possibly due to overestimation of both area and magnitude of summer convective precipitation (Scofield and Kuligowski 2003), which usually occupies only a small fraction of cold cloud area detected by the sensor. Further, IR-based techniques may overestimate because of misidentification of some cold clouds, such as cirrus, that may not generate any rainfall (Kidd 2001).
In summary, the TMPA-7A product outperformed satellite-only SBP products at the monthly time scale as expected because of the monthly gauge correction procedure; our analysis, however, indicated that the correction procedure resulted in only a slight improvement in the precipitation estimates at the daily time scale compared to TMPA-7RT, hence indicating time-scale dependence of the correction procedure. The MPE product showed the poorest performance at the daily time scale compared to other tested SBP products. Evaluated satellite-only SBP products generally suffered from daily precipitation detection problems in the windward region during the cold season, with the problem being most significant for CMORPH and least significant for TMPA-7RT. It should be noted that the majority of the annual precipitation occurs during the cold season and is mainly orographic in character. The warm season is generally dry with a more convective-type precipitation pattern. Since the analysis focused on two seasons, it is likely that mixed types of precipitation occur on both ends of a season.
Performance of the SBP products in estimating daily heavy precipitation is particularly important if these products will be utilized in flood monitoring. The RGP dataset showed a distinct difference in heavy precipitation frequency in the windward region (more frequent) compared to the leeward region, which is more significant in the cold season. However, the tested SBP products failed to discriminate this behavior, while most significantly underestimating heavy precipitation frequency in the cold season over the windward region. The CMORPH product showed significantly less frequent heavy precipitation events compared to RGP and other SBP products due to limitations with the MW datasets. TMPA products generally showed the best performance in capturing the frequency of heavy precipitation events, with a general behavior showing underestimation (overestimation) in the windward (leeward) region. In terms of capturing light precipitation frequency, SBP products generally showed underestimation in the cold season and better performance in the warm season; the MPE product showed the best result. The TMPA-7A product resulted in further underestimation of light precipitation frequency compared to the TMPA-7RT product, possibly because of the algorithm calibration procedure.
Focusing on daily precipitation detection, the considered SBP products are characterized by varying degrees of precipitation detection problems in the windward region, which are more significant in the cold season, and they showed improved precipitation detection performance (high POD and low FAR) in the leeward region, especially in the warm season. TMPA products generally outperformed other SBP products in terms of daily precipitation detection, with TMPA-7RT being slightly superior. However, the TMPA-7A product provided the best performance in detecting heavy precipitation. Note that the retrospectively processed TMPA-7RT product includes a number of archived datasets that are not available in near-real time; hence, the operational TMPA-7RT product may have different (somewhat worse) error characteristics (Huffman 2013).
To investigate the potential sources of these errors in SBP products, a further step was taken in which the MW datasets used in the TMPA-7RT and CMORPH products and the IR dataset (MW calibrated) utilized in the TMPA-7RT product were compared with the RGP dataset. Our results showed that the MW datasets used in these products perform similarly in the cold season and are responsible for significant precipitation underestimation (around negative 75%) in both leeward and windward regions. The IR dataset, on the other hand, performed slightly better with positive bias (around positive 25%) in the leeward region and less negative bias (around negative 50%) in the windward region compared to MW datasets. In the warm season, the MW dataset used in the CMORPH product showed higher performance compared to that of the TMPA-7RT product and showed similar bias and improved correlation performance compared to the IR dataset used in the TMPA-7RT product. Hence, the uniform nature of underestimation by the CMORPH product can be attributed to the significant underestimation by the MW dataset used in the algorithm, and the spatially heterogeneous nature of the TMPA-7RT product can be explained by the differences in precipitation rates produced by the IR and MW datasets used in this algorithm.
FIG. 11. Daily statistical results (a) CORR, (b) NRMSE, and (c) %BIAS obtained by comparing RGP grids in region 1 (gray box) and region 2 (white box) and their corresponding collocated 3B41RT, 3B40RT, and MWCOMB grids during the cold and warm seasons.
The results of our study indicated major challenges to satellite-based precipitation estimation algorithms over complex topography, such as those related to orographic precipitation and precipitation estimation over cold surfaces. Since these challenges are intrinsic to the satellite sensors, improvements in the performance of these products would be possible through increasing the information content utilized by these algorithms via merging satellite-based information with ground-based precipitation observations (rain gauge network), ground-based precipitation estimates (weather radar), and physically based regional climate model simulations. Although a number of weather radars exist in Turkey, they are not yet calibrated to provide long-term rain rate information. Our future work will focus on development of merging algorithms that are specifically designed for regions with complex topography. Since all types of precipitation observation/estimation sensors, including the rain gauge network, contain errors, streamflow observations have the potential to provide an independent check on the degree of errors in the precipitation datasets that are important for hydrologic applications. Therefore, in future work, we will drive a hydrologic model with the precipitation datasets used in this study to evaluate the performance of the hydrologic model in simulating the response of the watershed, that is, the streamflow. Turkey is characterized by complex topography, and this situation exerts strong controls on the precipitation regime and climate. Hence, evaluation studies performed in different hydroclimatic regions in Turkey and elsewhere will shed further light on the utility of these products for use in hydrologic studies in ungauged regions with complex topography around the globe.