A benchmark dataset for global evapotranspiration estimation based on FLUXNET2015 from 2000 to 2022 (V1.0)
Authors/Creators
- 1. Institute of RS and GIS, School of Earth and Space Sciences, Peking University, Beijing 100871, China
- 2. Beijing Key Laboratory of Spatial Information Integration and Its Applications, Beijing 100871, China
- 3. State Key Laboratory of Hydro-science and Engineering, Department of Hydraulic Engineering, Tsinghua University, Beijing 100084, China
- 4. Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- 5. Key Laboratory of Earth Surface Processes and Regional Response in the Yangtze–Huaihe River Basin, School of Geography and Tourism, Anhui Normal University, Wuhu 241000, China
- 6. Faculty of Modern Agricultural Engineering, Kunming University of Science and Technology, Kunming 650500, China
Description
The released data mainly contain the following types of data:
(1) Half-hourly or hourly gap-filled LE data: LE data gap-filled with the novel bias-corrected RF algorithm. In the filenames, “HH” and “HR” indicate half-hourly and hourly data, respectively. Each record carries a pair of timestamps consistent with those in FLUXNET2015, recorded in local time. The start time is “2000-02-18, 00:00:00”, and the end time is the end of the observation period at each site. For the quality control (QC) flags, a value of 0 indicates observed data and 1 indicates gap-filled data.
(2) Prolonged daily LE data: Daily LE data prolonged with the novel bias-corrected RF algorithm. The seamless data cover the period from February 18, 2000, to December 31, 2022. For the prolonged part, the quality flag is set to 2. The remaining data are consistent with the aggregated daily LE data.
(3) Aggregated daily, monthly and yearly LE data: The daily dataset is aggregated from the gap-filled half-hourly data. The start time is “2000-02-18”, and the end time is the end of the observation period at each site. Quality control flags are also provided; their values give the percentage of hourly observations available for each day. The monthly and yearly LE data are aggregated from the prolonged daily LE data. Their quality control flags give the proportion of days with more than 90% of hourly observations in a given month or year. No distinction is made between prolonged days and days with no observations at all. The monthly data start in March 2000, and the yearly data start in 2001.
All files are in CSV format. NDVI and debiased reference variables from ERA5-Land are also provided.
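The QC conventions described above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' processing code: the record layout (timestamp, LE, QC flag) and the function names `aggregate_daily` and `monthly_qc` are hypothetical, the daily QC is expressed as a fraction (0 to 1) rather than a percentage, and the monthly QC is computed over the days present in the input rather than over every calendar day.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def aggregate_daily(records):
    """Return {date: (mean LE, fraction of observed records)}.

    `records` is an iterable of (timestamp, LE in W/m2, QC flag),
    where QC 0 = observed and 1 = gap-filled (hypothetical layout).
    """
    by_day = defaultdict(list)
    for ts, le, qc in records:
        by_day[ts.date()].append((le, qc))
    daily = {}
    for day, rows in by_day.items():
        mean_le = sum(le for le, _ in rows) / len(rows)
        observed = sum(1 for _, qc in rows if qc == 0) / len(rows)
        daily[day] = (mean_le, observed)
    return daily


def monthly_qc(daily, threshold=0.9):
    """Proportion of days per (year, month) whose observed fraction exceeds `threshold`."""
    counts = defaultdict(lambda: [0, 0])  # [days above threshold, all days]
    for day, (_, observed) in daily.items():
        key = (day.year, day.month)
        counts[key][1] += 1
        if observed > threshold:
            counts[key][0] += 1
    return {key: good / total for key, (good, total) in counts.items()}


# Synthetic example: two days of half-hourly data with constant LE.
# Day 1 is fully observed; the second half of day 2 is gap-filled.
start = datetime(2000, 3, 1)
records = []
for i in range(96):
    ts = start + timedelta(minutes=30 * i)
    qc = 0 if i < 72 else 1
    records.append((ts, 100.0, qc))

daily = aggregate_daily(records)
qc_month = monthly_qc(daily)
```

In this synthetic case the second day has an observed fraction of 0.5, so only one of the two days passes the 90% threshold. Under the convention described above, prolonged days would count as days without observations.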
For more details on the data, please refer to the companion research article submitted to ESSD: Li, W., Yao, Z., Qu, Y., Yang, H., Song, Y., Song, L., Wu, L., and Cui, Y.: A benchmark dataset for global evapotranspiration estimation based on FLUXNET2015 from 2000-2022, Earth Syst. Sci. Data, under review, 2024.
Abstract
Evapotranspiration (ET) is a crucial component of the terrestrial hydrological cycle. Latent heat flux (LE, equivalent to ET in W/m2) observed by the eddy covariance (EC) technique, also known as LEEC, is widely recognized as a highly accurate benchmark for global ET estimation. There is an increasing need for long time-series benchmark data to support climate change analysis, construction of new models, and validation of new products. However, existing LEEC datasets, such as FLUXNET2015, have limited observation periods and extensive data gaps, which hinders their application. To address these issues, we developed a gap-filling and prolongation framework for LEEC data and established a benchmark dataset for ground-based ET from 2000 to 2022 across 64 sites. The framework contains three main parts: site selection and data pre-processing, generation of gap-filled half-hourly/hourly LE data, and generation of prolonged daily LE data. We selected 64 sites from FLUXNET2015 based on rigorous filtering criteria. A novel bias-corrected random forest (RF) algorithm was used for both gap-filling and prolongation to produce seamless half-hourly and daily LE data. The framework achieves strong performance in both hourly gap-filling and daily prolongation, with median RMSEs of 32.84 W/m2 and 16.58 W/m2, respectively. Compared with the original RF and marginal distribution sampling (MDS) algorithms, the bias-corrected algorithm significantly improved gap-filling performance for long gaps and extreme values. The results demonstrate robust prolongation performance with respect to both prolonging direction and temporal stability. The data distribution of our gap-filled dataset is highly consistent with that of the FLUXNET2015 dataset. In conclusion, a benchmark dataset for global ET estimation based on FLUXNET2015 from 2000 to 2022 is published for the first time.
This dataset provides strong data support for ET modelling, water-carbon cycle monitoring and climate change analysis.
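Because LE is reported in W/m2, users who need ET as a water depth have to convert units. The sketch below shows one common conversion from daily mean LE to mm/day. It assumes a constant latent heat of vaporization of 2.45 MJ/kg (a widely used approximation; the true value varies with temperature) and is not taken from the dataset documentation. The function name is hypothetical.

```python
LATENT_HEAT_J_PER_KG = 2.45e6  # assumed constant latent heat of vaporization
SECONDS_PER_DAY = 86400


def le_to_et_mm_per_day(le_w_m2, latent_heat=LATENT_HEAT_J_PER_KG):
    """Convert daily mean LE (W/m2) to ET (mm/day).

    Energy per day (J/m2) divided by latent heat (J/kg) gives kg/m2 of water,
    and 1 kg/m2 of liquid water corresponds to a depth of 1 mm.
    """
    return le_w_m2 * SECONDS_PER_DAY / latent_heat


et = le_to_et_mm_per_day(100.0)  # about 3.53 mm/day
```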
Files
prolonged_daily_2000-2022.zip
Files (586.1 MB)
| MD5 | Size |
|---|---|
| ffe506c10bc5fbe6443368be7cf90221 | 13.9 MB |
| cd36eeadffb69e86db11bf25a5047387 | 1.4 MB |
| 111db4e5b59076b3fe71295b670497c7 | 128.6 kB |
| a619fdd2faaac0a11871121fa1f54e9f | 531.3 MB |
| 1b63abe35d495e717e9c36dd8596400f | 39.3 MB |
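A downloaded archive can be checked against its published MD5 checksum. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `verify` are hypothetical, and since the listing here does not show which checksum belongs to which file, the expected value should be taken from the record page for the specific file.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Return True if the file's MD5 matches the expected hex digest."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, `verify(path_to_archive, checksum_from_record)` returns `True` only when the download is intact. Reading in chunks keeps memory use low for the larger archives.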