Published October 18, 2024 | Version v4
Dataset Open

Integrated Approach to Global Land Use and Land Cover Reference Data Harmonization

  • 1. Instituto Mauro Borges de Estatísticas e Estudos Socioeconômicos (IMB)
  • 2. Laboratório de Processamento de Imagens e Geoprocessamento (LAPIG)
  • 3. OpenGeoHub Foundation

Description

INTRODUCTION

This document describes the creation of a global inventory of reference samples and Earth Observation (EO) / gridded datasets for the Global Pasture Watch (GPW) initiative. The inventory supports the training and validation of the machine-learning models used for GPW grassland mapping. The sections below cover the methodology, data sources, workflow, and results.

Keywords: Grassland, Land Use, Land Cover, Gridded Datasets, Harmonization

 

OBJECTIVES

  • Create a global inventory of existing reference samples for land use and land cover (LULC);

  • Compile global EO / gridded datasets that capture LULC classes and harmonize them to match the GPW classes;

  • Develop automated scripts for data harmonization and integration.

 

DATA COLLECTION 

Datasets incorporated:

Dataset                                                     Spatial distribution  Time period  Number of individual samples
WorldCereal                                                 Global                2016-2021    38,267,911
Global Land Cover Mapping and Estimation (GLanCE)           Global                1985-2021    31,061,694
EuroCrops                                                   Europe                2015-2022    14,742,648
GeoWiki G-GLOPS training dataset                            Global                2021         11,394,623
MapBiomas Brazil                                            Brazil                1985-2018    3,234,370
Land Use/Land Cover Area Frame Survey (LUCAS)               Europe                2006-2018    1,351,293
Dynamic World                                               Global                2019-2020    1,249,983
Land Change Monitoring, Assessment, and Projection (LCMap)  U.S. (CONUS)          1984-2018    874,836
GeoWiki 2012                                                Global                2011-2012    151,942
PREDICTS                                                    Global                1984-2013    16,627
CropHarvest                                                 Global                2018-2021    9,714

Total: 102,355,642 samples

 

WORKFLOW

Harmonization Process

We harmonized global reference samples and EO/gridded datasets to the GPW classes so that they can be used directly in the GPW machine-learning workflow.

We considered reference samples derived by visual interpretation with a spatial support of at least 30 m (Landsat and Sentinel) that could represent LULC classes for a point or region.

Each dataset was processed using automated Python scripts to download vector files and convert the original LULC classes into the following GPW classes:

       0. Other land cover

       1. Natural and Semi-natural grassland

       2. Cultivated grassland

       3. Crops and other related agricultural practices
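
The class conversion described above can be pictured as a lookup from each source label to one of the four GPW codes. The following is only a minimal sketch under stated assumptions: the source labels in `EXAMPLE_CLASS_MAP` and the helper `to_gpw_class` are hypothetical and are not taken from the published scripts; only the four GPW codes come from this record.

```python
from enum import IntEnum


class GPWClass(IntEnum):
    """The four GPW classes listed in this record."""
    OTHER_LAND_COVER = 0
    NATURAL_SEMINATURAL_GRASSLAND = 1
    CULTIVATED_GRASSLAND = 2
    CROPS_AND_RELATED = 3


# Hypothetical source labels, for illustration only. Each real dataset
# needs its own table built from that dataset's class descriptions.
EXAMPLE_CLASS_MAP = {
    "forest": GPWClass.OTHER_LAND_COVER,
    "natural grassland": GPWClass.NATURAL_SEMINATURAL_GRASSLAND,
    "sown pasture": GPWClass.CULTIVATED_GRASSLAND,
    "maize": GPWClass.CROPS_AND_RELATED,
}


def to_gpw_class(original_label, class_map=EXAMPLE_CLASS_MAP):
    """Translate one original LULC label into a GPW class code."""
    key = original_label.strip().lower()
    if key not in class_map:
        # Fail loudly rather than silently assigning "Other land cover".
        raise KeyError(f"no GPW mapping defined for {original_label!r}")
    return class_map[key]
```

Raising an error for unmapped labels is a design choice in this sketch: it keeps unknown source classes from being counted as "Other land cover" by accident.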

We empirically assigned a weight to each sample based on the original dataset's class description, reflecting the level of mixture within the class. The weights range from 1 (Low) to 3 (High), with higher weights indicating greater mixture. Samples with low mixture (weight 1) are therefore the most reliable for differentiating typologies and for validation purposes.
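
As an illustration of how the weights might be used, the sketch below keeps only the samples at or below a chosen mixture level. The helper `select_by_mixture` and the example records are hypothetical; the field names follow the harmonized columns of this dataset, and the weights are assumed to be stored as the integers 1 to 3.

```python
def select_by_mixture(samples, max_weight=1):
    """Return the samples whose sample_weight is at most max_weight.

    Weight 1 means low mixture and weight 3 means high mixture, so the
    default keeps only the least mixed samples.
    """
    if max_weight not in (1, 2, 3):
        raise ValueError("max_weight must be 1, 2, or 3")
    return [s for s in samples if s["sample_weight"] <= max_weight]


# Made-up records for illustration; not taken from the dataset.
samples = [
    {"gpw_lulc_class": 1, "sample_weight": 1},
    {"gpw_lulc_class": 2, "sample_weight": 3},
    {"gpw_lulc_class": 3, "sample_weight": 2},
]
validation_candidates = select_by_mixture(samples)
```

A stricter threshold suits validation, while a looser one (for example `max_weight=2`) could be acceptable when more training data is needed.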

The harmonized dataset includes these columns:

Attribute Name       Definition
dataset_name         Original dataset name
reference_year       Reference year of samples from the original dataset
original_lulc_class  LULC class from the original dataset
gpw_lulc_class       Global Pasture Watch LULC class
sample_weight        Sample's weight based on the mixture level within the original LULC class
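
A reader might check these columns as follows. This is a hedged sketch, not the project's loader: it assumes the attributes have been exported to CSV without geometry, and that `reference_year`, `gpw_lulc_class`, and `sample_weight` are stored as integers. The function `read_harmonized_rows` is a hypothetical name.

```python
import csv
import io

# Column names as documented in the attribute table above.
EXPECTED_COLUMNS = (
    "dataset_name",
    "reference_year",
    "original_lulc_class",
    "gpw_lulc_class",
    "sample_weight",
)


def read_harmonized_rows(handle):
    """Read attribute rows from a CSV handle and check the documented columns."""
    reader = csv.DictReader(handle)
    present = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in present]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for row in reader:
        rows.append({
            "dataset_name": row["dataset_name"],
            "reference_year": int(row["reference_year"]),  # assumed integer
            "original_lulc_class": row["original_lulc_class"],
            "gpw_lulc_class": int(row["gpw_lulc_class"]),  # assumed code 0-3
            "sample_weight": int(row["sample_weight"]),    # assumed 1-3
        })
    return rows


# In-memory example with made-up values.
example_csv = io.StringIO(
    "dataset_name,reference_year,original_lulc_class,gpw_lulc_class,sample_weight\n"
    "example_source,2018,maize,3,1\n"
)
rows = read_harmonized_rows(example_csv)
```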

 

ACKNOWLEDGMENTS

The development of this global inventory of reference samples and EO/gridded datasets relied on valuable contributions from various sources. We would like to express our sincere gratitude to the creators and maintainers of all datasets used in this project.

 

REFERENCES

  • Brown, C.F., Brumby, S.P., Guzder-Williams, B. et al. Dynamic World, Near real-time global 10 m land use land cover mapping. Sci Data 9, 251 (2022). https://doi.org/10.1038/s41597-022-01307-4

  • Van Tricht, K. et al. WorldCereal: a dynamic open-source system for global-scale, seasonal, and reproducible crop and irrigation mapping. Earth Syst. Sci. Data 15, 5491–5515, 10.5194/essd-15-5491-2023 (2023)

  • Buchhorn, M.; Smets, B.; Bertels, L.; De Roo, B.; Lesiv, M.; Tsendbazar, N.E., Linlin, L., Tarko, A. (2020): Copernicus Global Land Service: Land Cover 100m: Version 3 Globe 2015-2019: Product User Manual; Zenodo, Geneve, Switzerland, September 2020; doi: 10.5281/zenodo.3938963

  • d’Andrimont, R. et al. Harmonised LUCAS in-situ land cover and use database for field surveys from 2006 to 2018 in the European Union. Sci. Data 7, 352, 10.1038/s41597-019-0340-y (2020)

  • Fritz, S. et al. Geo-Wiki: An online platform for improving global land cover, Environmental Modelling & Software, 31, https://doi.org/10.1016/j.envsoft.2011.11.015 (2012)

  • Fritz, S., See, L., Perger, C. et al. A global dataset of crowdsourced land cover and land use reference data. Sci Data 4, 170075 https://doi.org/10.1038/sdata.2017.75 (2017)

  • Schneider, M., Schelte, T., Schmitz, F. & Körner, M. EuroCrops: The largest harmonized open crop dataset across the European Union. Sci. Data 10, 612, 10.1038/s41597-023-02517-0 (2023)

  • Souza, C. M. et al. Reconstructing Three Decades of Land Use and Land Cover Changes in Brazilian Biomes with Landsat Archive and Earth Engine. Remote Sens. 12, 2735, 10.3390/rs12172735 (2020)

  • Stanimirova, R. et al. A global land cover training dataset from 1984 to 2020. Sci. Data 10, 879 (2023) 

  • Stehman, S. V., Pengra, B. W., Horton, J. A. & Wellington, D. F. Validation of the US Geological Survey’s Land Change Monitoring, Assessment and Projection (LCMAP) Collection 1.0 annual land cover products 1985–2017. Remote Sensing of Environment 265, 112646, 10.1016/j.rse.2021.112646 (2021)

  • Tsendbazar, N. et al. Product validation report (d12-pvr) v 1.1 (2021).

  • Tseng, G., Zvonkov, I., Nakalembe, C. L., & Kerner, H. (2021). CropHarvest: A global dataset for crop-type classification. In Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track.

Files

01_etl_lcmap-conus.ipynb

Files (2.1 GB)
