Published February 15, 2024 | Version 1.00

Labelled dataset to classify direct deforestation drivers in Cameroon

  • 1. University of Cambridge
  • 2. International Institute for Sustainable Development
  • 3. United Nations Development Programme
  • 4. Centre for Environment and Development (CED)
  • 5. Forêts et Développement Rural (FODER)
  • 6. Joint Research Centre
  • 7. ARHS Developments Italia S.R.L.

Description

Overview

This dataset includes the images (visible bands for Landsat-8 or NICFI PlanetScope), auxiliary data (infrared, NCEP, forest gain, OpenStreetMap, SRTM, GFW), and data about forest loss (Global Forest Change) used to train, validate and test a model to classify direct deforestation drivers in Cameroon. 

Description of the files

  • 'my_examples_landsat_final_detailed.zip': Landsat-8 images, auxiliary data and forest loss data used to train, validate and test a model for a detailed classification of deforestation drivers in Cameroon (15 classes: ‘Oil palm plantation’, ‘Timber plantation’, ‘Fruit plantation (e.g. banana)’, ‘Rubber plantation’, ‘Other large-scale plantation (e.g. tea, sugarcane)’, ‘Grassland/Shrubland’, ‘Small-scale oil palm plantation’, ‘Small-scale maize plantation’, ‘Other small-scale agriculture’, ‘Mining’, ‘Selective logging’, ‘Infrastructure’, ‘Wildfire’, ‘Hunting’, ‘Other’)
  • 'my_examples_planet_final_detailed.zip': NICFI PlanetScope images, auxiliary data and forest loss data used to train, validate and test a model for a detailed classification of deforestation drivers in Cameroon (15 classes)
  • 'my_examples_landsat_final.zip': Landsat-8 images, auxiliary data and forest loss data used to train, validate and test a model for a classification of deforestation drivers by groups in Cameroon (4 classes: 'Plantation', 'Grassland/Shrubland', 'Smallholder agriculture', 'Other')
  • 'my_examples_planet_final.zip': NICFI PlanetScope images, auxiliary data and forest loss data used to train, validate and test a model for a classification of deforestation drivers by groups in Cameroon (4 classes)
  • 'my_examples_landsat_detailed_timeseries.zip': Landsat-8 images, auxiliary data and forest loss data used to test a model for a detailed classification of deforestation drivers in Cameroon (15 classes) using multiple images and a time series analysis 
  • 'my_examples_planet_detailed_timeseries.zip': NICFI PlanetScope images, auxiliary data and forest loss data used to test a model for a detailed classification of deforestation drivers in Cameroon (15 classes) using multiple images and a time series analysis
  • ‘labels.zip’: CSV files giving the label for each image in each folder described above (each image is identified by its folder and coordinates, or ‘path’). These files match the format of the CSV files used as inputs to train, validate and test our classification model

    For ‘labels.zip’, we have subfolders for Landsat and PlanetScope. Then, for each type of imagery, we have subfolders for ‘detailed’, ‘groups’ and ‘time series’ which correspond to the different ‘my_examples’ folders listed above. 
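    As a hedged illustration of how one of these label files could be inspected, the sketch below counts the number of images per class in a single CSV file. The column name `label` is an assumption and is not taken from the dataset, so check the header row of the actual CSV files before using it.

```python
import csv
from collections import Counter
from pathlib import Path


def count_labels(csv_path, label_column="label"):
    """Count rows per class in one labels CSV file.

    `label_column` is a hypothetical column name; adjust it to the
    header actually used in the CSV files from labels.zip.
    """
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # Accessing fieldnames reads the header row of the file.
        if reader.fieldnames is None or label_column not in reader.fieldnames:
            raise KeyError(f"column {label_column!r} not found in {reader.fieldnames}")
        return Counter(row[label_column] for row in reader)
```

    The function raises a `KeyError` when the assumed column is missing, which makes a wrong guess about the header visible immediately.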

    Within each ‘my_examples’ folder, there is one subfolder per image, named with the coordinates of the centre of the image. Each of these subfolders contains:
    •    A folder ‘images’, with a sub-folder ‘visible’ containing the PNG RGB image, and a sub-folder ‘infrared’ containing the infrared bands in an NPY file.
    •    A folder ‘auxiliary’ with topographic and forest gain information in NPY format, OpenStreetMap and peat data in JSON format, and a sub-folder ‘ncep’ containing all data from NCEP in NPY format.
    •    The forest loss pickle file delimiting the area of forest loss.
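    The layout above can be walked with a few lines of Python. This is a minimal sketch, not the loading code from the Cam-ForestNet repository: the function name `index_example` and the returned keys are our own illustrative choices, and individual file names inside each folder are discovered by extension rather than assumed.

```python
import json
from pathlib import Path


def index_example(example_dir):
    """List the files of one example folder, grouped by the layout above.

    Only the JSON files are parsed here. NPY files can be read with
    numpy.load and the pickle file with pickle.load (only unpickle
    files from a source you trust).
    """
    root = Path(example_dir)
    aux = root / "auxiliary"
    return {
        "visible_png": sorted((root / "images" / "visible").glob("*.png")),
        "infrared_npy": sorted((root / "images" / "infrared").glob("*.npy")),
        "auxiliary_npy": sorted(aux.glob("*.npy")),
        "ncep_npy": sorted((aux / "ncep").glob("*.npy")),
        # Parsed content of each JSON file, keyed by file name.
        "auxiliary_json": {
            p.name: json.loads(p.read_text(encoding="utf-8"))
            for p in sorted(aux.glob("*.json"))
        },
        "forest_loss_pkl": sorted(root.glob("*.pkl")),
    }
```

    Calling `index_example` on one coordinate-named subfolder returns a dictionary of paths, which is a convenient way to check that a download is complete before loading any arrays.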

Details about the images

  • For Landsat-8 data (courtesy of the U.S. Geological Survey), this dataset contains 332 x 332 pixel RGB calibrated top-of-atmosphere (TOA) reflectance images pan-sharpened to a 15 m resolution (less than 20% cloud cover)

  • For NICFI PlanetScope data (catalog owner: Planet), this dataset contains 332 x 332 pixel monthly RGB composites with a 4.77 m resolution
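The image size and resolution together give the ground footprint of each image. The short calculation below is only an illustration derived from the numbers above; the helper name `footprint_m` is our own.

```python
def footprint_m(pixels, resolution_m):
    """Side length in metres of a square image of `pixels` x `pixels`."""
    return pixels * resolution_m


landsat_side = footprint_m(332, 15.0)   # Landsat-8, 15 m pixels
planet_side = footprint_m(332, 4.77)    # NICFI PlanetScope, 4.77 m pixels

print(f"Landsat-8: {landsat_side / 1000:.2f} km per side")
print(f"PlanetScope: {planet_side / 1000:.2f} km per side")
```

A Landsat-8 image therefore covers roughly 4.98 km per side, while a PlanetScope image of the same pixel size covers roughly 1.58 km per side.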

Details about the auxiliary data

  • Forest gain from GFC: 30-m resolution, yearly data for 2000-2021, downloaded via Google Earth Engine
  • Near infrared, shortwave infrared 1 and 2 bands from Landsat-8 TOA: 30-m resolution, data every 16 days for 2013-2023, downloaded via Google Earth Engine and selected using the same process as for Landsat-8 RGB images
  • From NCEP Climate Forecast System Version 2 (CFSv2) 6-hourly Products: 22264-m resolution, available four times a day for 2011-2023, downloaded directly from the NOAA website. For each parameter, we selected the mean of the monthly mean, the monthly maximum and the monthly minimum over the 5 years before the forest loss event. The parameters are:
    •    surface level albedo and volumetric soil moisture content (depths: 0.1 m, 0.4 m, 1.0 m, 2.0 m) in 0.01%
    •    radiative fluxes (clear-sky longwave flux downward and upward, clear-sky solar flux downward and upward, direct evaporation from bare soil, longwave and shortwave radiation flux downward and upward, latent, ground and sensible heat net flux), potential evaporation rate, and sublimation in W/m²
    •    humidity (specific, maximum specific, minimum specific) in 10⁻⁴ kg/kg
    •    ground level precipitation in 0.1 mm
    •    air pressure at surface level in 10 Pa
    •    wind (u and v components) in 0.01 m/s
    •    water runoff at surface level in 232.01 kg/m²
    •    temperature in K
  • Closest street and closest city from OpenStreetMap in km: directly downloaded with the Nominatim API
  • Altitude in m, slope and aspect in 0.01° from the Shuttle Radar Topography Mission (SRTM): 30-m resolution, measured for 2000, downloaded via Google Earth Engine
  • Presence of peat from GFW: 232-m resolution, measured for 2017, directly downloaded from the GFW website
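To make the NCEP selection step more concrete, here is a minimal sketch of one possible reading of it: 6-hourly values are averaged per calendar month, and the mean, maximum and minimum of those monthly means are taken over the 5 calendar years before the forest loss year. The function name, the `(year, month, value)` input format and the use of whole calendar years are all assumptions; the actual processing is in the 'prepare_files' code of the repository and may differ, for example by taking the maximum within each month instead.

```python
from collections import defaultdict
from statistics import mean


def summarise_before_loss(samples, loss_year, years_before=5):
    """Summarise one NCEP parameter over the years before a loss event.

    samples: iterable of (year, month, value) tuples, e.g. 6-hourly values.
    Returns (mean, max, min) of the monthly means for the years
    loss_year - years_before up to and including loss_year - 1.
    """
    monthly = defaultdict(list)
    for year, month, value in samples:
        # Keep only samples inside the assumed whole-year window.
        if loss_year - years_before <= year < loss_year:
            monthly[(year, month)].append(value)
    if not monthly:
        raise ValueError("no samples in the window before the loss year")
    monthly_means = [mean(values) for values in monthly.values()]
    return mean(monthly_means), max(monthly_means), min(monthly_means)
```

With this reading, each parameter yields three numbers per forest loss event, whatever the original sampling frequency.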

Details about Global Forest Change

For each image, there is a corresponding 'forest_loss_region' .pkl file delimiting a forest loss region polygon from Global Forest Change (GFC). GFC consists of annual maps of forest cover loss with a 30-m resolution. 
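As a hedged example of what can be done with such a polygon, the sketch below computes the area of a simple polygon with the shoelace formula and converts it to hectares. It assumes the polygon vertices are available as a list of (x, y) pixel coordinates in the image grid, which is our assumption about the pickle content and should be checked against the actual files; the function name `polygon_area_ha` is also our own.

```python
def polygon_area_ha(vertices_px, resolution_m):
    """Shoelace area of a simple polygon given in pixel coordinates.

    vertices_px: list of (x, y) pixel coordinates (an assumption about
    how the forest loss polygon is stored).
    resolution_m: pixel size in metres (e.g. 15 for the Landsat-8 images).
    Returns the area in hectares.
    """
    n = len(vertices_px)
    if n < 3:
        raise ValueError("a polygon needs at least three vertices")
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices_px[i]
        x2, y2 = vertices_px[(i + 1) % n]  # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1
    area_px = abs(twice_area) / 2.0
    # One pixel covers resolution_m squared m²; 1 ha = 10,000 m².
    return area_px * resolution_m ** 2 / 10_000.0
```

For instance, a 10 x 10 pixel square at 15 m resolution covers 100 pixels, that is 22,500 m² or 2.25 ha.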

License

The NICFI PlanetScope images fall under the same license as the NICFI data program license agreement (data in 'my_examples_planet_final.zip', 'my_examples_planet_final_detailed.zip', 'my_examples_planet_detailed_timeseries.zip': subfolders '[coordinates]'>'images'>'visible'). 

OpenStreetMap® is open data, licensed under the Open Data Commons Open Database License (ODbL) by the OpenStreetMap Foundation (OSMF) (data in all 'my_examples' folders: subfolders '[coordinates]'>'auxiliary'>'closest_city.json'/'closest_street.json'). The documentation is licensed under the Creative Commons Attribution-ShareAlike 2.0 license (CC BY-SA 2.0).

The rest of the data is under a Creative Commons Attribution 4.0 International License. The data has been transformed following the code that can be found via this link: https://github.com/aedebus/Cam-ForestNet (in 'prepare_files').

 

Notes

For more details about how this dataset has been created and can be used, please refer to our paper and code: https://github.com/aedebus/Cam-ForestNet. The paper can be found here: https://www.nature.com/articles/s41597-024-03384-z. 

Citation: 

Debus, A. et al. A labelled dataset to classify direct deforestation drivers from Earth Observation imagery in Cameroon. Sci Data 11, 564 (2024).

Files


Files (5.0 GB)

Checksum | Size
md5:61589af531e386a640c6e027969216f4 | 415.1 kB
md5:98fe90d1b3df9253a33206c09487b207 | 988.5 MB
md5:ff7cd5bca3bc21a5df4e10ed0b39451f | 719.1 MB
md5:49a1fa0a32e87353fa6aa9719dddb6ca | 724.7 MB
md5:adb6173f37db09ff3b5fbdac1e30862a | 974.8 MB
md5:732f4364d1e3f1a8d04bc50d2b620747 | 787.3 MB
md5:43cb2d8a3c021eacbb16a1391f5e9aac | 793.5 MB

Additional details

Funding

UK Research and Innovation
C-CLEAR: Cambridge Climate, Life and Earth sciences DTP NE/S007164/1