Labelled dataset to classify direct deforestation drivers in Cameroon: time series
Authors/Creators
Description
Overview
This dataset includes the images (visible and near-infrared bands from Landsat-8), auxiliary data (infrared, NCEP, forest gain, OpenStreetMap, SRTM, GFW), and forest loss data (Global Forest Change) used to evaluate the performance of a model that classifies direct deforestation and degradation drivers in Cameroon (Cam-ForestNet) using images from multiple years following the forest loss event.
This dataset has been created following the same approach as another dataset, accessible here: https://zenodo.org/records/8325259. For more details about how this dataset has been created and can be used, please refer to our paper and code: https://github.com/aedebus/Cam-ForestNet. The paper can be found here: https://www.nature.com/articles/s41597-024-03384-z. Citation: Debus, A. et al. A labelled dataset to classify direct deforestation drivers from Earth Observation imagery in Cameroon. Sci Data 11, 564 (2024).
We used an approach similar to that of the other dataset but, instead of using all the images from the five years following the forest loss event, we filtered the data to select, for each forest loss polygon, the image with the lowest cloud cover in each of the four years following the loss event. We only used images with a cloud cover below 20% and kept only the forest loss polygons for which at least one image is available in each of the four years following the loss event.
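The selection rule described above can be sketched as follows. This is a minimal illustration, not the code from the repository: the record keys ('polygon_id', 'years_after_loss', 'cloud_cover', 'image_id') and the function name are hypothetical, and the 20% threshold is treated as a strict upper bound.

```python
from collections import defaultdict

MAX_CLOUD_COVER = 20.0           # percent; threshold stated in the text
YEARS_AFTER_LOSS = (1, 2, 3, 4)  # the four years following the loss event


def select_time_series(candidates):
    """Pick the least cloudy image per polygon and per year after the loss.

    `candidates` is an iterable of dicts with the hypothetical keys
    'polygon_id', 'years_after_loss', 'cloud_cover' and 'image_id'.
    Polygons without a usable image in all four years are dropped.
    """
    best = defaultdict(dict)  # polygon_id -> {year offset -> best candidate}
    for cand in candidates:
        year = cand["years_after_loss"]
        if year not in YEARS_AFTER_LOSS:
            continue
        if cand["cloud_cover"] >= MAX_CLOUD_COVER:
            continue
        current = best[cand["polygon_id"]].get(year)
        if current is None or cand["cloud_cover"] < current["cloud_cover"]:
            best[cand["polygon_id"]][year] = cand
    # Keep only polygons covered in every one of the four years.
    return {
        polygon_id: [per_year[y] for y in YEARS_AFTER_LOSS]
        for polygon_id, per_year in best.items()
        if all(y in per_year for y in YEARS_AFTER_LOSS)
    }
```

The result maps each retained polygon to a list of four records, ordered from the first to the fourth year after the loss event.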
Description of the files
- 'my_examples_landsat_nir_per_year.zip': Landsat-8 images, auxiliary data and forest loss data used to test the performance of Cam-ForestNet with time series for a detailed classification of deforestation drivers in Cameroon (15 classes: ‘Oil palm plantation’, ‘Timber plantation’, ‘Fruit plantation (e.g. banana)’, ‘Rubber plantation’, ‘Other large-scale plantation (e.g. tea, sugarcane)’, ‘Grassland/Shrubland’, ‘Small-scale oil palm plantation’, ‘Small-scale maize plantation’, ‘Other small-scale agriculture’, ‘Mining’, ‘Selective logging’, ‘Infrastructure’, ‘Wildfire’, ‘Hunting’, ‘Other’). The subfolders use the suffixes 'first', 'second', 'third', 'fourth' to indicate when the image was taken in relation to the forest loss event, i.e. in the first, second, third or fourth year after the forest loss event.
- ‘labels.csv’: CSV file in which each image is identified by its path in 'my_examples_landsat_nir_per_year.zip' and assigned a label.
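Because the year of each image is encoded in the subfolder suffix, it can be recovered from an image path. The sketch below assumes that the suffix ends one of the path components; the exact folder layout is not specified here, so the example paths and the function name are hypothetical.

```python
from pathlib import PurePosixPath

# Suffixes named in the description, mapped to years after the loss event.
YEAR_SUFFIXES = {"first": 1, "second": 2, "third": 3, "fourth": 4}


def years_after_loss(path):
    """Return 1-4 if a path component ends with a known suffix, else None.

    Assumption: the suffix terminates a folder name somewhere in the path.
    """
    for part in PurePosixPath(path).parts:
        for suffix, offset in YEAR_SUFFIXES.items():
            if part.endswith(suffix):
                return offset
    return None


# Hypothetical path, for illustration only.
print(years_after_loss("some_folder_second/some_location/image.png"))
```

Paths without a recognised suffix return `None`, so unexpected entries can be flagged rather than silently assigned to a year.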
Details about the images
Landsat-8 data (courtesy of the U.S. Geological Survey): this dataset contains 332 x 332 pixel RGB+NIR calibrated top-of-atmosphere (TOA) reflectance images with a 30-m resolution (less than 20% cloud cover). Each forest loss location is associated with four images, one for each of the four consecutive years following the forest loss event.
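As a quick check of what these numbers imply, the sketch below computes the ground footprint of one image and the number of pixel values per location. It assumes square pixels at the nominal 30 m and four bands (R, G, B, NIR) per image; the constant and function names are illustrative.

```python
PIXELS_PER_SIDE = 332    # image width and height in pixels
RESOLUTION_M = 30        # nominal pixel size in metres
IMAGES_PER_LOCATION = 4  # one image per year after the loss event
BANDS_PER_IMAGE = 4      # assumption: R, G, B and NIR


def footprint_km(pixels=PIXELS_PER_SIDE, resolution_m=RESOLUTION_M):
    """Return (side length in km, area in km²) covered by one image."""
    side_km = pixels * resolution_m / 1000
    return side_km, side_km ** 2


side_km, area_km2 = footprint_km()
values_per_location = IMAGES_PER_LOCATION * BANDS_PER_IMAGE * PIXELS_PER_SIDE ** 2
print(f"{side_km:.2f} km per side, {area_km2:.1f} km² per image")
print(f"{values_per_location} pixel values per forest loss location")
```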
Details about the auxiliary data
- Forest gain from GFC: 30-m resolution, yearly data for 2000-2021, downloaded via Google Earth Engine
- Near infrared, shortwave infrared 1 and 2 bands from Landsat-8 TOA: 30-m resolution, data every 16 days for 2013-2023, downloaded via Google Earth Engine and selected using the same process as for Landsat-8 RGB images
- From NCEP Climate Forecast System Version 2 (CFSv2) 6-hourly Products: surface level albedo and volumetric soil moisture content (depths: 0.1 m, 0.4 m, 1.0 m, 2.0 m) in 0.01%; radiative fluxes (clear-sky longwave flux downward and upward, clear-sky solar flux downward and upward, direct evaporation from bare soil, longwave and shortwave radiation flux downward and upward, latent, ground and sensible heat net flux), potential evaporation rate, and sublimation in W/m²; humidity (specific, maximum specific, minimum specific) in 10⁻⁴ kg/kg; ground level precipitation in 0.1 mm; air pressure at surface level in 10 Pa; wind (u and v components) in 0.01 m/s; water runoff at surface level in 232.01 kg/m²; temperature in K. 22264-m resolution, available four times a day for 2011-2023, downloaded directly from the NOAA website. For each parameter, we selected the mean of the monthly mean, the monthly maximum, and the monthly minimum over the 5 years before the forest loss event.
- Closest street and closest city from OpenStreetMap in km: directly downloaded with the Nominatim API
- Altitude in m, slope and aspect in 0.01° from Shuttle Radar Topography Mission (SRTM): 30-m resolution, measured for 2000, downloaded via Google Earth Engine
- Presence of peat from GFW: 232-m resolution, measured for 2017, directly downloaded on the GFW website
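The NCEP summary statistics (mean of the monthly mean, monthly maximum, monthly minimum over the 5 years before the loss event) could be computed along the following lines. This is one possible reading of that description, not the repository's implementation: it assumes the window is the five calendar years preceding the loss year, and it takes the maximum and minimum over all monthly maxima and minima. The function name and input format are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def summarise_before_loss(samples, loss_year, window_years=5):
    """Summarise one parameter over the years preceding the loss event.

    `samples` is an iterable of (datetime, value) pairs, for example the
    6-hourly values of one parameter at one location.
    """
    by_month = defaultdict(list)  # (year, month) -> values in that month
    for timestamp, value in samples:
        if loss_year - window_years <= timestamp.year < loss_year:
            by_month[(timestamp.year, timestamp.month)].append(value)
    if not by_month:
        raise ValueError("no samples fall inside the window")
    monthly_means = [mean(values) for values in by_month.values()]
    return {
        "mean": mean(monthly_means),                            # mean of monthly means
        "max": max(max(values) for values in by_month.values()),  # highest monthly maximum
        "min": min(min(values) for values in by_month.values()),  # lowest monthly minimum
    }
```

Values stored in scaled units (for example air pressure in 10 Pa) would still need to be multiplied by their unit factor to obtain SI values.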
Details about Global Forest Change
For each image, there is a corresponding 'forest_loss_region' .pkl file delimiting a forest loss region polygon from Global Forest Change (GFC). GFC consists of annual maps of forest cover loss with a 30-m resolution.
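A 'forest_loss_region' file can be opened with Python's standard `pickle` module, as in the minimal sketch below. The type of the stored object is not stated here; if it is a geometry object from a third-party library (an assumption), that library must be installed for unpickling to succeed. The function name is illustrative.

```python
import pickle


def load_forest_loss_region(path):
    """Load the object stored in a 'forest_loss_region' .pkl file.

    Only unpickle files obtained from a trusted source: unpickling can
    execute arbitrary code.
    """
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

Inspecting `type(region)` on the returned object is a simple way to find out which library is needed to work with the polygon.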
License
OpenStreetMap® is open data, licensed under the Open Data Commons Open Database License (ODbL) by the OpenStreetMap Foundation (OSMF) (data in all subfolders '[coordinates]'>'auxiliary'>'closest_city.json'/'closest_street.json'). The documentation is licensed under the Creative Commons Attribution-ShareAlike 2.0 license (CC BY-SA 2.0).
The rest of the data is under a Creative Commons Attribution 4.0 International License. The data has been transformed following the code that can be found via this link: https://github.com/aedebus/Cam-ForestNet (in 'prepare_files').
Files
Total size: 8.4 GB
| Name | Size | MD5 |
|---|---|---|
| labels.csv | 1.1 MB | 73365b7166e0644d0e80229a66c3c7b7 |
| my_examples_landsat_nir_per_year.zip | 8.4 GB | 06f191031267592dc1db2fb09a1fbdcb |
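After downloading, the MD5 checksums listed above can be used to check file integrity. The sketch below reads a file in chunks, so the 8.4 GB archive does not have to fit in memory, and tests whether its digest is one of the two listed values. The function names are illustrative.

```python
import hashlib

# The two MD5 checksums listed for this record.
LISTED_MD5 = {
    "73365b7166e0644d0e80229a66c3c7b7",
    "06f191031267592dc1db2fb09a1fbdcb",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path):
    """True if the file's MD5 digest is one of the listed checksums."""
    return md5_of_file(path) in LISTED_MD5
```

A mismatch usually points to an incomplete or corrupted download, in which case the file should be fetched again.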
Additional details
Funding
- UK Research and Innovation
- C-CLEAR: Cambridge Climate, Life and Earth sciences DTP NE/S007164/1
Software
- Repository URL: https://github.com/aedebus/Cam-ForestNet