Integrated Approach to Global Land Use and Land Cover Reference Data Harmonization
Creators
- 1. Instituto Mauro Borges de Estatísticas e Estudos Socioeconômicos (IMB)
- 2. Laboratório de Processamento de Imagens e Geoprocessamento (LAPIG)
- 3. OpenGeoHub Foundation
Description
INTRODUCTION
This document describes the creation of a global inventory of reference samples and Earth Observation (EO) / gridded datasets for the Global Pasture Watch (GPW) initiative. The inventory supports the training and validation of machine-learning models for GPW grassland mapping. The sections below cover the methodology, data sources, workflow, and results.
Keywords: Grassland, Land Use, Land Cover, Gridded Datasets, Harmonization
OBJECTIVES
- Create a global inventory of existing reference samples for land use and land cover (LULC);
- Compile global EO / gridded datasets that capture LULC classes and harmonize them to match the GPW classes;
- Develop automated scripts for data harmonization and integration.
DATA COLLECTION
Datasets incorporated:
Dataset | Spatial distribution | Time period | Number of individual samples |
WorldCereal | Global | 2016-2021 | 38,267,911 |
Global Land Cover Mapping and Estimation (GLanCE) | Global | 1985-2021 | 31,061,694 |
EuroCrops | Europe | 2015-2022 | 14,742,648 |
GeoWiki G-GLOPS training dataset | Global | 2021 | 11,394,623 |
MapBiomas Brazil | Brazil | 1985-2018 | 3,234,370 |
Land Use/Land Cover Area Frame Survey (LUCAS) | Europe | 2006-2018 | 1,351,293 |
Dynamic World | Global | 2019-2020 | 1,249,983 |
Land Change Monitoring, Assessment, and Projection (LCMap) | U.S. (CONUS) | 1984-2018 | 874,836 |
GeoWiki 2012 | Global | 2011-2012 | 151,942 |
PREDICTS | Global | 1984-2013 | 16,627 |
CropHarvest | Global | 2018-2021 | 9,714 |
Total: 102,355,642 samples
WORKFLOW
Harmonization Process
We harmonized global reference samples and EO/gridded datasets to align with GPW classes, optimizing their integration into the GPW machine-learning workflow.
We considered reference samples derived from visual interpretation, with a spatial support of at least 30 m (Landsat and Sentinel), which could represent LULC classes for a point or a region.
Each dataset was processed using automated Python scripts to download vector files and convert the original LULC classes into the following GPW classes:
0. Other land cover
1. Natural and Semi-natural grassland
2. Cultivated grassland
3. Crops and other related agricultural practices
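As an illustration only, the class conversion step could be sketched as a lookup from original class labels to GPW class codes. The four GPW codes come from the list above; the crosswalk entries and the function name `to_gpw_class` are hypothetical and do not reproduce the crosswalks used in the project scripts.

```python
# GPW class codes, as listed in this document.
GPW_CLASSES = {
    0: "Other land cover",
    1: "Natural and Semi-natural grassland",
    2: "Cultivated grassland",
    3: "Crops and other related agricultural practices",
}

# Hypothetical crosswalk: these labels and assignments are illustrative
# assumptions, not the mapping applied to any of the source datasets.
EXAMPLE_CROSSWALK = {
    "forest": 0,
    "water": 0,
    "natural grassland": 1,
    "sown pasture": 2,
    "maize": 3,
}


def to_gpw_class(original_label, crosswalk=EXAMPLE_CROSSWALK):
    """Return the GPW class code for an original LULC class label."""
    key = original_label.strip().lower()
    if key not in crosswalk:
        # Fail loudly instead of defaulting to class 0, so that unmapped
        # classes are noticed rather than silently labeled "Other land cover".
        raise KeyError(f"No GPW mapping defined for original class {original_label!r}")
    return crosswalk[key]


if __name__ == "__main__":
    code = to_gpw_class("Sown pasture")
    print(code, GPW_CLASSES[code])
```

Raising an error for unmapped labels is a design choice of this sketch: it keeps gaps in a crosswalk visible during harmonization.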
We empirically assigned a weight to each sample based on the original dataset's class description, reflecting the level of mixture within the class. The weights range from 1 (Low) to 3 (High), with higher weights indicating greater mixture. Samples with low mixture levels are more accurate and effective for differentiating typologies and for validation purposes.
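One possible use of the weights, sketched under assumptions, is to keep only low-mixture samples when building a validation set. The `Sample` class, the function `select_low_mixture`, and the default threshold are hypothetical; only the 1 to 3 weight range is taken from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    gpw_lulc_class: int   # GPW class code, 0 to 3
    sample_weight: int    # 1 = low mixture, 3 = high mixture


def select_low_mixture(samples, max_weight=1):
    """Keep the samples whose mixture weight is at most max_weight."""
    for s in samples:
        if not 1 <= s.sample_weight <= 3:
            raise ValueError(f"sample_weight out of range: {s.sample_weight}")
    return [s for s in samples if s.sample_weight <= max_weight]


if __name__ == "__main__":
    samples = [Sample(1, 1), Sample(1, 3), Sample(2, 2)]
    print(select_low_mixture(samples))
    print(select_low_mixture(samples, max_weight=2))
```

The threshold is left as a parameter because the appropriate cut-off for a given use is not specified here.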
The harmonized dataset includes these columns:
Attribute Name | Definition |
dataset_name | Original dataset name |
reference_year | Reference year of samples from the original dataset |
original_lulc_class | LULC class from the original dataset |
gpw_lulc_class | Global Pasture Watch LULC class |
sample_weight | Sample's weight based on the mixture level within the original LULC class |
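A minimal consistency check on these columns might look like the sketch below. It assumes, for illustration only, that a subset of the harmonized table has been exported as CSV text; the actual file formats in the archive may differ, and the function name `validate_rows` and the sample values are hypothetical.

```python
import csv
import io

# Column names as defined in the attribute table above.
EXPECTED_COLUMNS = [
    "dataset_name",
    "reference_year",
    "original_lulc_class",
    "gpw_lulc_class",
    "sample_weight",
]


def validate_rows(csv_text):
    """Return a list of (line_number, message) for rows that break the schema."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = set(EXPECTED_COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"Missing columns: {sorted(missing)}")

    problems = []
    # Data rows start on line 2 because line 1 holds the header.
    for line_no, row in enumerate(reader, start=2):
        try:
            gpw_class = int(row["gpw_lulc_class"])
            weight = int(row["sample_weight"])
        except (TypeError, ValueError):
            problems.append((line_no, "non-integer class or weight"))
            continue
        if gpw_class not in (0, 1, 2, 3):
            problems.append((line_no, f"unknown GPW class {gpw_class}"))
        if weight not in (1, 2, 3):
            problems.append((line_no, f"weight {weight} outside 1-3"))
    return problems


if __name__ == "__main__":
    # Placeholder values, not real records from the inventory.
    text = (
        "dataset_name,reference_year,original_lulc_class,gpw_lulc_class,sample_weight\n"
        "example_dataset,2018,class_a,1,1\n"
        "example_dataset,2018,class_b,7,2\n"
    )
    print(validate_rows(text))
```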
ACKNOWLEDGMENTS
The development of this global inventory of reference samples and EO/gridded datasets relied on valuable contributions from various sources. We would like to express our sincere gratitude to the creators and maintainers of all datasets used in this project.
REFERENCES
- Brown, C.F., Brumby, S.P., Guzder-Williams, B. et al. Dynamic World, Near real-time global 10 m land use land cover mapping. Sci. Data 9, 251 (2022). https://doi.org/10.1038/s41597-022-01307-4
- Buchhorn, M.; Smets, B.; Bertels, L.; De Roo, B.; Lesiv, M.; Tsendbazar, N.E., Linlin, L., Tarko, A. (2020): Copernicus Global Land Service: Land Cover 100m: Version 3 Globe 2015-2019: Product User Manual; Zenodo, Geneve, Switzerland, September 2020; doi: 10.5281/zenodo.3938963
- d’Andrimont, R. et al. Harmonised LUCAS in-situ land cover and use database for field surveys from 2006 to 2018 in the European Union. Sci. Data 7, 352, 10.1038/s41597-019-0340-y (2020)
- Fritz, S. et al. Geo-Wiki: An online platform for improving global land cover. Environmental Modelling & Software 31, https://doi.org/10.1016/j.envsoft.2011.11.015 (2012)
- Fritz, S., See, L., Perger, C. et al. A global dataset of crowdsourced land cover and land use reference data. Sci. Data 4, 170075, https://doi.org/10.1038/sdata.2017.75 (2017)
- Schneider, M., Schelte, T., Schmitz, F. & Körner, M. EuroCrops: The largest harmonized open crop dataset across the European Union. Sci. Data 10, 612, 10.1038/s41597-023-02517-0 (2023)
- Souza, C. M. et al. Reconstructing Three Decades of Land Use and Land Cover Changes in Brazilian Biomes with Landsat Archive and Earth Engine. Remote Sens. 12, 2735, 10.3390/rs12172735 (2020)
- Stanimirova, R. et al. A global land cover training dataset from 1984 to 2020. Sci. Data 10, 879 (2023)
- Stehman, S. V., Pengra, B. W., Horton, J. A. & Wellington, D. F. Validation of the US Geological Survey’s Land Change Monitoring, Assessment and Projection (LCMAP) Collection 1.0 annual land cover products 1985–2017. Remote Sensing of Environment 265, 112646, 10.1016/j.rse.2021.112646 (2021)
- Tsendbazar, N. et al. Product validation report (d12-pvr) v 1.1 (2021)
- Tseng, G., Zvonkov, I., Nakalembe, C. L., & Kerner, H. (2021). CropHarvest: A global dataset for crop-type classification. In Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track.
- Van Tricht, K. et al. WorldCereal: a dynamic open-source system for global-scale, seasonal, and reproducible crop and irrigation mapping. Earth Syst. Sci. Data 15, 5491–5515, 10.5194/essd-15-5491-2023 (2023)