TreeSatAI Benchmark Archive for Deep Learning in Forest Applications

Schulz, Christian; Ahlswede, Steve; Gava, Christiano; Helber, Patrick; Bischke, Benjamin; Arias, Florencia; Förster, Michael; Hees, Jörn; Demir, Begüm; Kleinschmit, Birgit

doi:10.5281/zenodo.6778154

Published August 1, 2022 | Version 1.0.1

Dataset Open

1. Technische Universität Berlin, Geoinformation in Environmental Planning Lab
2. Technische Universität Berlin, Remote Sensing Image Analysis Group
3. Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Smart Data and Knowledge Services
4. Vision Impulse GmbH

Context and Aim

Deep learning in Earth Observation requires large image archives with highly reliable labels for model training and testing. However, a preferable quality standard for forest applications in Europe has not yet been determined. The TreeSatAI consortium investigated numerous sources for annotated datasets as an alternative to manually labeled training datasets.

We found that the federal forest inventory of Lower Saxony, Germany, represents an unseen treasure of annotated samples for training data generation. The corresponding 20-cm color-infrared (CIR) imagery, which is used for forestry management through visual interpretation, constitutes an excellent baseline for deep learning tasks such as image segmentation and classification.

Description

The data archive is highly suitable for benchmarking as it represents the real-world data situation of many German forest management services. On the one hand, it has a high number of samples which are supported by the high-resolution aerial imagery. On the other hand, this data archive presents challenges, including class label imbalances between the different forest stand types.

The TreeSatAI Benchmark Archive contains:

50,381 image triplets (aerial, Sentinel-1, Sentinel-2)
synchronized time steps and locations
all original spectral bands/polarizations from the sensors
20 species classes (single labels)
12 age classes (single labels)
15 genus classes (multi labels)
60 m and 200 m patches
fixed split for train (90%) and test (10%) data
additional single labels such as English species name, genus, forest stand type, foliage type, land cover

The geoTIFF and GeoJSON files are readable in any GIS software, such as QGIS. For further information, we refer to the PDF document in the archive and publications in the reference section.
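
The GeoJSON files can also be inspected without GIS software. The following is a minimal sketch using only the Python standard library. It assumes that each file is a standard GeoJSON FeatureCollection; the file name in the usage part is hypothetical, and no property names from the archive are assumed.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_geojson(path):
    """Count the features per geometry type in a GeoJSON FeatureCollection."""
    with open(path, encoding="utf-8") as fh:
        collection = json.load(fh)
    features = collection.get("features", [])
    # GeoJSON allows a null geometry, so fall back to an empty dict.
    counts = Counter(
        (feature.get("geometry") or {}).get("type", "None")
        for feature in features
    )
    return dict(counts)


if __name__ == "__main__":
    # Hypothetical path: adjust it to a file from the unzipped "geojson" folder.
    example = Path("geojson/points.geojson")
    if example.exists():
        print(summarize_geojson(example))
```

A file holding the 60 m bounding boxes would be expected to report polygon geometries, and a point file point geometries, but this should be checked against the actual files.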

Version history

v1.0.1 - Minor bug fixes in multi-label JSON file and description file

v1.0.0 - First release

Citation

Ahlswede et al. (in prep.)

GitHub

Full code examples and pre-trained models from the dataset article (Ahlswede et al. 2022) using the TreeSatAI Benchmark Archive are published on the GitHub repositories of the Remote Sensing Image Analysis (RSiM) Group (https://git.tu-berlin.de/rsim/treesat_benchmark). Code examples for the sampling strategy can be made available by Christian Schulz via email request.

Folder structure

We refer to the proposed folder structure in the PDF file.

Folder “aerial” contains the aerial imagery patches derived from summertime orthophotos of the years 2011 to 2020. Patches are available in 60 x 60 m (304 x 304 pixels). Band order is near-infrared, red, green, and blue. Spatial resolution is 20 cm.
Folder “s1” contains the Sentinel-1 imagery patches derived from summertime mosaics of the years 2015 to 2020. Patches are available in 60 x 60 m (6 x 6 pixels) and 200 x 200 m (20 x 20 pixels). Band order is VV, VH, and VV/VH ratio. Spatial resolution is 10 m.
Folder “s2” contains the Sentinel-2 imagery patches derived from summertime mosaics of the years 2015 to 2020. Patches are available in 60 x 60 m (6 x 6 pixels) and 200 x 200 m (20 x 20 pixels). Band order is B02, B03, B04, B08, B05, B06, B07, B8A, B11, B12, B01, and B09. Spatial resolution is 10 m.
Folder “labels” contains a JSON file which was used for multi-labeling of the training patches. An example entry for an image sample with proportions of about 94% for Abies and 6% for Larix is: "Abies_alba_3_834_WEFL_NLF.tif": [["Abies", 0.93771], ["Larix", 0.06229]]
The two files “test_filenames.lst” and “train_filenames.lst” define the filenames used for the train (90%) and test (10%) split. We recommend using this fixed split for better reproducibility and comparability.
Folder “geojson” contains GeoJSON files with all the samples chosen for training patch generation (point, 60 m bounding box, 200 m bounding box).
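
The label entry and the split files described above can be read with a few lines of Python. This is a minimal sketch, not the authors' loading code: it assumes that the label file is a single JSON object mapping file names to [genus, proportion] pairs (as in the example entry), and that each .lst file holds one file name per line. The `min_share` threshold is a hypothetical parameter introduced for illustration.

```python
import json


def load_multilabels(json_text):
    """Parse the multi-label JSON into {filename: {genus: proportion}}."""
    raw = json.loads(json_text)
    return {
        name: {genus: float(share) for genus, share in pairs}
        for name, pairs in raw.items()
    }


def dominant_genus(shares):
    """Return the genus with the largest proportion in one patch."""
    return max(shares, key=shares.get)


def genera_above(shares, min_share):
    """Return the genera whose proportion reaches min_share (hypothetical cut-off)."""
    return sorted(genus for genus, share in shares.items() if share >= min_share)


def read_split(path):
    """Read a split file, assuming one file name per line."""
    with open(path, encoding="utf-8") as fh:
        return [line.strip() for line in fh if line.strip()]


# The example entry from the description, wrapped in a JSON object.
sample = '{"Abies_alba_3_834_WEFL_NLF.tif": [["Abies", 0.93771], ["Larix", 0.06229]]}'
labels = load_multilabels(sample)
shares = labels["Abies_alba_3_834_WEFL_NLF.tif"]
print(dominant_genus(shares))      # Abies
print(genera_above(shares, 0.07))  # ['Abies']
```

Passing the lists returned by `read_split("train_filenames.lst")` and `read_split("test_filenames.lst")` as key filters on `labels` would reproduce the fixed split, provided the file names in both sources match.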

CAUTION: As we could not upload the aerial patches as a single zip file on Zenodo, you need to download the 20 single species files (aerial_60m_…zip) separately. Then, unzip them into a folder named “aerial” with a subfolder named “60m”. This structure is recommended for better reproducibility and comparability to the experimental results of Ahlswede et al. (2022).
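
The unzip step can be scripted. The sketch below extracts every downloaded archive whose name starts with aerial_60m_ into aerial/60m. The function name and both directory arguments are hypothetical, and the internal layout of the zip files is not stated here, so the result should be checked for an unexpected extra folder level.

```python
import zipfile
from pathlib import Path


def extract_aerial_archives(download_dir, archive_root):
    """Unzip all aerial_60m_*.zip files into <archive_root>/aerial/60m."""
    target = Path(archive_root) / "aerial" / "60m"
    target.mkdir(parents=True, exist_ok=True)
    extracted = []
    for zip_path in sorted(Path(download_dir).glob("aerial_60m_*.zip")):
        with zipfile.ZipFile(zip_path) as archive:
            # Contents are extracted as stored in the zip file.
            archive.extractall(target)
        extracted.append(zip_path.name)
    return extracted


if __name__ == "__main__":
    # Hypothetical locations: the download folder and the archive root.
    done = extract_aerial_archives("downloads", "treesatai")
    print(f"extracted {len(done)} archives")
```

With all files downloaded, the function should report 20 archives.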

Join the archive

Model training, benchmarking, algorithm development… many applications are possible! Feel free to add samples from other regions in Europe or even worldwide. Additional remote sensing data from Lidar, UAVs or aerial imagery from different time steps are very welcome. This helps the research community to develop better deep learning and machine learning models for forest applications. If you have questions or want to share code, results, or publications using the archive, feel free to contact the authors.

Project description

This work was part of the project TreeSatAI (Artificial Intelligence with Satellite data and Multi-Source Geodata for Monitoring of Trees at Infrastructures, Nature Conservation Sites and Forests). Its overall aim is the development of AI methods for the monitoring of forests and woody features on a local, regional and global scale. Based on freely available geodata from different sources (e.g., remote sensing, administration maps, and social media), prototypes will be developed for the deep learning-based extraction and classification of tree and tree stand features. These prototypes deal with real cases from the monitoring of managed forests, nature conservation and infrastructures. The development of the resulting services by three enterprises (liveEO, Vision Impulse and LUP Potsdam) will be supported by three research institutes (German Research Center for Artificial Intelligence, TUB Remote Sensing Image Analysis Group, TUB Geoinformation in Environmental Planning Lab).

Publications

Ahlswede et al. (2022, in prep.): TreeSatAI Dataset Publication

Ahlswede, S., Nimisha, T.M., and Demir, B. (2022, in revision): Embedded Self-Enhancement Maps for Weakly Supervised Tree Species Mapping in Remote Sensing Images. IEEE Trans Geosci Remote Sens.

Schulz et al. (2022, in prep.): Phenoprofiling

Conference contributions

S. Ahlswede, N. T. Madam, C. Schulz, B. Kleinschmit and B. Demir, “Weakly Supervised Semantic Segmentation of Remote Sensing Images for Tree Species Classification Based on Explanation Methods”, IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, 2022.

C. Schulz, M. Förster, S. Vulova, T. Gränzig and B. Kleinschmit, “Exploring the temporal fingerprints of mid-European forest types from Sentinel-1 RVI and Sentinel-2 NDVI time series”, IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, 2022.

C. Schulz, M. Förster, S. Vulova and B. Kleinschmit, “The temporal fingerprints of common European forest types from SAR and optical remote sensing data”, AGU Fall Meeting, New Orleans, USA, 2021.

B. Kleinschmit, M. Förster, C. Schulz, F. Arias, B. Demir, S. Ahlswede, A. K. Aksoy, T. Ha Minh, J. Hees, C. Gava, P. Helber, B. Bischke, P. Habelitz, A. Frick, R. Klinke, S. Gey, D. Seidel, S. Przywarra, R. Zondag and B. Odermatt, “Artificial Intelligence with Satellite data and Multi-Source Geodata for Monitoring of Trees and Forests”, Living Planet Symposium, Bonn, Germany, 2022.

C. Schulz, M. Förster, S. Vulova, T. Gränzig and B. Kleinschmit, (2022, submitted): “Exploring the temporal fingerprints of sixteen mid-European forest types from Sentinel-1 and Sentinel-2 time series”, ForestSAT, Berlin, Germany, 2022.

Notes

Acknowledgements: TreeSatAI was funded by the Bundesministerium für Bildung und Forschung (BMBF) under the grant number 01IS20014A. We are grateful to the Niedersächsische Landesforsten for providing the forest management data, the inventory data and the aerial imagery.

Files

Files (16.3 GB)

Name	MD5	Size
220629_doc_TreeSatAI_benchmark_archive.pdf	4d6b87bde2e20bef81f325ca62ccbf22	2.1 MB
aerial_60m_abies_alba.zip	4298b1c9fbf6d0d85f7aa208ff5fe0c9	310.3 MB
aerial_60m_acer_pseudoplatanus.zip	7c31d7ddea841f6509deece8f984a79e	857.7 MB
aerial_60m_alnus_spec.zip	34ea107f43c6172c6d2652dbf26306af	791.4 MB
aerial_60m_betula_spec.zip	69de9373739a027692a823846434fa0c	886.4 MB
aerial_60m_cleared.zip	8dffbb2f6aad17ef83721cffa5b52d96	1.2 GB
aerial_60m_fagus_sylvatica.zip	77b277e69e90bfbd3c5fd15a73d228fe	2.0 GB
aerial_60m_fraxinus_excelsior.zip	9a88a8e6821f8a54ded950de9238831f	815.0 MB
aerial_60m_larix_decidua.zip	aa0bc5b091b099018a078536ef429031	417.7 MB
aerial_60m_larix_kaempferi.zip	429df073f69f8bbf60aef765e1c925ba	550.5 MB
aerial_60m_picea_abies.zip	edb9b1bc9a5a7b405f4cbb0d71cedf54	1.8 GB
aerial_60m_pinus_nigra.zip	96bf1798ef82f712ea46c2963ddb7083	124.5 MB
aerial_60m_pinus_strobus.zip	0ff818c6d31f59b8488880e49b300c7a	156.3 MB
aerial_60m_pinus_sylvestris.zip	298cbaac4d9f07a204e1e74e8446798d	2.0 GB
aerial_60m_populus_spec.zip	46fcff76b119cc24f3caf938a0bb433a	144.4 MB
aerial_60m_prunus_spec.zip	fb1c570d3ea925a049630224ccb354bc	91.5 MB
aerial_60m_pseudotsuga_menziesii.zip	2d05511ceabf4037b869eca928f3c04e	838.7 MB
aerial_60m_quercus_petraea.zip	31f573fb0419b2b453ed7da1c4d2a298	808.1 MB
aerial_60m_quercus_robur.zip	bcd90506509de26692c043f4c8d73af0	1.1 GB
aerial_60m_quercus_rubra.zip	71d8495725ed1b4f27d9e382409fcc5e	576.3 MB
aerial_60m_tilia_spec.zip	f81558c9c7189ac8a257d041ee43c1c9	64.1 MB
geojson.zip	aa749718f3cb76c1dfc9cddc2ed201db	8.1 MB
labels.zip	4f1bee76a87018147785fe73c7053cd1	581.1 kB
s1.zip	bed4fc8cb65da46a24ec1bc6cea2763c	320.2 MB
s2.zip	453ba69056aa33a3c6b97afb7b6afadb	510.0 MB
test_filenames.lst	2166903d947f0025f61e342da466f917	184.9 kB
train_filenames.lst	a1a0148e8120b0268f76d2e98a68436f	1.7 MB
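
The MD5 checksums listed above can be used to verify the downloads. The following is a minimal sketch with hypothetical function names; only the two split files are shown in the `EXPECTED` mapping, and the remaining entries can be copied from the file list in the same way.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_files(folder, expected):
    """Map each file name to True/False, or None if the file is missing."""
    results = {}
    for name, checksum in expected.items():
        path = Path(folder) / name
        results[name] = md5_of_file(path) == checksum if path.exists() else None
    return results


# Checksums copied from the file list of this record.
EXPECTED = {
    "test_filenames.lst": "2166903d947f0025f61e342da466f917",
    "train_filenames.lst": "a1a0148e8120b0268f76d2e98a68436f",
}

if __name__ == "__main__":
    # Hypothetical download folder.
    for name, ok in verify_files("downloads", EXPECTED).items():
        print(name, "missing" if ok is None else ("ok" if ok else "MISMATCH"))
```

Reading in chunks keeps memory use low for the larger archives, some of which are about 2 GB.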