There is a newer version of the record available.

Published August 1, 2022 | Version 1.0.0
Dataset Open

TreeSatAI Benchmark Archive for Deep Learning in Forest Applications

  • 1. Technische Universität Berlin, Geoinformation in Environmental Planning Lab
  • 2. Technische Universität Berlin, Remote Sensing Image Analysis Group
  • 3. Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Smart Data and Knowledge Services
  • 4. Vision Impulse GmbH

Description

 

Context and Aim

Deep learning in Earth Observation requires large image archives with highly reliable labels for model training and testing. However, a preferable quality standard for forest applications in Europe has not yet been determined. The TreeSatAI consortium investigated numerous sources for annotated datasets as an alternative to manually labeled training datasets.

We found that the federal forest inventory of Lower Saxony, Germany, is a largely untapped source of annotated samples for training data generation. The corresponding 20 cm color-infrared (CIR) imagery, which is used for forestry management through visual interpretation, provides an excellent baseline for deep learning tasks such as image segmentation and classification.

 

Description

The data archive is well suited for benchmarking because it reflects the real-world data situation of many German forest management services. On the one hand, it offers a large number of samples supported by high-resolution aerial imagery. On the other hand, it presents challenges, including class label imbalances between the different forest stand types.

The TreeSatAI Benchmark Archive contains:

  • 50,381 image triplets (aerial, Sentinel-1, Sentinel-2)

  • synchronized time steps and locations

  • all original spectral bands/polarizations from the sensors

  • 20 species classes (single labels)

  • 12 age classes (single labels)

  • 15 genus classes (multi labels)

  • 60 m and 200 m patches

  • fixed split for train (90%) and test (10%) data

  • additional single labels such as English species name, genus, forest stand type, foliage type, land cover

The GeoTIFF and GeoJSON files are readable in any GIS software, such as QGIS. For further information, we refer to the PDF document in the archive and the publications in the reference section.

 

Version history

v1.0.0 - First release

 

Citation

Ahlswede et al. (in prep.)

 

GitHub

Full code examples and pre-trained models from the dataset article (Ahlswede et al. 2022) using the TreeSatAI Benchmark Archive are published on the GitHub repositories of the Remote Sensing Image Analysis (RSiM) Group (https://git.tu-berlin.de/rsim/treesat_benchmark). Code examples for the sampling strategy are available from Christian Schulz on request by email.

 

Folder structure

We refer to the proposed folder structure in the PDF file.

  • Folder “aerial” contains the aerial imagery patches derived from summertime orthophotos of the years 2011 to 2020. Patches are available in 60 x 60 m (304 x 304 pixels). Band order is near-infrared, red, green, and blue. Spatial resolution is 20 cm.

  • Folder “s1” contains the Sentinel-1 imagery patches derived from summertime mosaics of the years 2015 to 2020. Patches are available in 60 x 60 m (6 x 6 pixels) and 200 x 200 m (20 x 20 pixels). Band order is VV, VH, and VV/VH ratio. Spatial resolution is 10 m.

  • Folder “s2” contains the Sentinel-2 imagery patches derived from summertime mosaics of the years 2015 to 2020. Patches are available in 60 x 60 m (6 x 6 pixels) and 200 x 200 m (20 x 20 pixels). Band order is B02, B03, B04, B08, B05, B06, B07, B8A, B11, B12, B01, and B09. Spatial resolution is 10 m.

  • The folder “labels” contains a JSON string that was used for multi-labeling of the training patches. An example entry for an image sample with proportions of 94% Abies and 6% Larix is: "Abies_alba_3_834_WEFL_NLF.tif": [["Abies", 0.93771], ["Larix", 0.06229]]

  • The two files “test_filesnames.lst” and “train_filenames.lst” list the filenames used for the train (90%) and test (10%) split. We recommend using this fixed split for better reproducibility and comparability.

  • The folder “geojson” contains GeoJSON files with all the samples selected for training patch generation (point, 60 m bounding box, 200 m bounding box).
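The label and split files described above can be combined into training targets. The following is a minimal Python sketch, not the authors' code. It assumes that the label JSON maps each patch filename to a list of [genus, proportion] pairs, as in the example entry, and that each .lst file holds one filename per line. The function names, the `threshold` parameter, and the inline sample data are hypothetical and only serve as illustration.

```python
import io
import json


def read_split(lines):
    """Return the non-empty, stripped filenames from an iterable of lines."""
    return [line.strip() for line in lines if line.strip()]


def multi_hot(pairs, classes, threshold=0.0):
    """Encode [[genus, proportion], ...] as a 0/1 vector over `classes`.

    A genus counts as present when its proportion exceeds `threshold`.
    The threshold is an assumption of this sketch, not a documented value.
    """
    present = {genus for genus, share in pairs if share > threshold}
    return [1 if genus in present else 0 for genus in classes]


# Inline stand-ins for the real files; the entry is the example quoted above.
labels_json = """{
  "Abies_alba_3_834_WEFL_NLF.tif": [["Abies", 0.93771], ["Larix", 0.06229]]
}"""
split_file = io.StringIO("Abies_alba_3_834_WEFL_NLF.tif\n\n")

labels = json.loads(labels_json)
train_files = read_split(split_file)

# Class list derived from the labels themselves, sorted for a stable order.
classes = sorted({genus for pairs in labels.values() for genus, _ in pairs})

# Keep every listed genus, or only those above a hypothetical 7% share.
targets = {name: multi_hot(labels[name], classes) for name in train_files}
strict = {name: multi_hot(labels[name], classes, threshold=0.07)
          for name in train_files}
```

With real data, `labels_json` and `split_file` would be replaced by the opened label and split files. Whether low-proportion genera should be kept or dropped is a modelling choice, which is why the threshold is left as a parameter.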

CAUTION: Because the aerial patches could not be uploaded to Zenodo as a single zip file, you need to download the 20 single-species files (aerial_60m_…zip) separately. Then unzip them into a folder named “aerial” with a subfolder named “60m”. This structure is recommended for better reproducibility and comparability with the experimental results of Ahlswede et al. (2022).
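The unzip step can be scripted. The sketch below is one possible way to do it with the Python standard library, under two assumptions: that the downloaded files match the pattern `aerial_60m_*.zip`, and that each zip holds the patches at its top level. The function name and both directory arguments are hypothetical.

```python
import zipfile
from pathlib import Path


def extract_aerial(download_dir, archive_root):
    """Unzip every aerial_60m_*.zip in `download_dir` into <archive_root>/aerial/60m.

    Returns the target folder and the number of zip files processed.
    """
    target = Path(archive_root) / "aerial" / "60m"
    target.mkdir(parents=True, exist_ok=True)
    archives = sorted(Path(download_dir).glob("aerial_60m_*.zip"))
    for archive in archives:
        with zipfile.ZipFile(archive) as zf:
            # Assumes the patches sit at the top level of each zip file.
            zf.extractall(target)
    return target, len(archives)
```

A call such as `extract_aerial("downloads", "TreeSatAI")` would then create `TreeSatAI/aerial/60m`. If the returned count is not 20, some of the single-species files are probably missing from the download folder.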

 

Join the archive

Model training, benchmarking, algorithm development… many applications are possible! Feel free to add samples from other regions in Europe or even worldwide. Additional remote sensing data from Lidar, UAVs or aerial imagery from different time steps are very welcome. This helps the research community develop better deep learning and machine learning models for forest applications. If you have questions or want to share code, results or publications that use the archive, feel free to contact the authors.

 

Project description

This work was part of the project TreeSatAI (Artificial Intelligence with Satellite data and Multi-Source Geodata for Monitoring of Trees at Infrastructures, Nature Conservation Sites and Forests). Its overall aim is the development of AI methods for the monitoring of forests and woody features on a local, regional and global scale. Based on freely available geodata from different sources (e.g., remote sensing, administration maps, and social media), prototypes will be developed for the deep learning-based extraction and classification of tree and tree stand features. These prototypes deal with real cases from the monitoring of managed forests, nature conservation and infrastructures. The development of the resulting services by three enterprises (liveEO, Vision Impulse and LUP Potsdam) will be supported by three research institutes (German Research Center for Artificial Intelligence, TUB Remote Sensing Image Analysis Group, TUB Geoinformation in Environmental Planning Lab).

 

Publications

Ahlswede et al. (2022, in prep.): TreeSatAI Dataset Publication

Ahlswede, S., Nimisha, T.M., and Demir, B. (2022, in revision): Embedded Self-Enhancement Maps for Weakly Supervised Tree Species Mapping in Remote Sensing Images. IEEE Trans Geosci Remote Sens.

Schulz et al. (2022, in prep.): Phenoprofiling

 

Conference contributions

S. Ahlswede, N. T. Madam, C. Schulz, B. Kleinschmit and B. Demir, "Weakly Supervised Semantic Segmentation of Remote Sensing Images for Tree Species Classification Based on Explanation Methods", IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, 2022.

C. Schulz, M. Förster, S. Vulova, T. Gränzig and B. Kleinschmit, “Exploring the temporal fingerprints of mid-European forest types from Sentinel-1 RVI and Sentinel-2 NDVI time series”, IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, 2022.

C. Schulz, M. Förster, S. Vulova and B. Kleinschmit, “The temporal fingerprints of common European forest types from SAR and optical remote sensing data”, AGU Fall Meeting, New Orleans, USA, 2021.

B. Kleinschmit, M. Förster, C. Schulz, F. Arias, B. Demir, S. Ahlswede, A. K. Aksoy, T. Ha Minh, J. Hees, C. Gava, P. Helber, B. Bischke, P. Habelitz, A. Frick, R. Klinke, S. Gey, D. Seidel, S. Przywarra, R. Zondag and B. Odermatt, “Artificial Intelligence with Satellite data and Multi-Source Geodata for Monitoring of Trees and Forests”, Living Planet Symposium, Bonn, Germany, 2022.

C. Schulz, M. Förster, S. Vulova, T. Gränzig and B. Kleinschmit, (2022, submitted): “Exploring the temporal fingerprints of sixteen mid-European forest types from Sentinel-1 and Sentinel-2 time series”, ForestSAT, Berlin, Germany, 2022.

Notes

TreeSatAI was funded by the Bundesministerium für Bildung und Forschung (BMBF) under the grant number 01IS20014A. We are grateful to the Niedersächsische Landesforsten for providing the forest management data, the inventory data and the aerial imagery.

Files

220613_doc_TreeSatAI_benchmark_archive.pdf

Files (16.3 GB)

Checksum  Size
md5:5378742583db6942825443d7d232fe89  2.1 MB
md5:4298b1c9fbf6d0d85f7aa208ff5fe0c9  310.3 MB
md5:7c31d7ddea841f6509deece8f984a79e  857.7 MB
md5:34ea107f43c6172c6d2652dbf26306af  791.4 MB
md5:69de9373739a027692a823846434fa0c  886.4 MB
md5:8dffbb2f6aad17ef83721cffa5b52d96  1.2 GB
md5:77b277e69e90bfbd3c5fd15a73d228fe  2.0 GB
md5:9a88a8e6821f8a54ded950de9238831f  815.0 MB
md5:aa0bc5b091b099018a078536ef429031  417.7 MB
md5:429df073f69f8bbf60aef765e1c925ba  550.5 MB
md5:edb9b1bc9a5a7b405f4cbb0d71cedf54  1.8 GB
md5:96bf1798ef82f712ea46c2963ddb7083  124.5 MB
md5:0ff818c6d31f59b8488880e49b300c7a  156.3 MB
md5:298cbaac4d9f07a204e1e74e8446798d  2.0 GB
md5:46fcff76b119cc24f3caf938a0bb433a  144.4 MB
md5:fb1c570d3ea925a049630224ccb354bc  91.5 MB
md5:2d05511ceabf4037b869eca928f3c04e  838.7 MB
md5:31f573fb0419b2b453ed7da1c4d2a298  808.1 MB
md5:bcd90506509de26692c043f4c8d73af0  1.1 GB
md5:71d8495725ed1b4f27d9e382409fcc5e  576.3 MB
md5:f81558c9c7189ac8a257d041ee43c1c9  64.1 MB
md5:aa749718f3cb76c1dfc9cddc2ed201db  8.1 MB
md5:d267e3579950eb2c95fa3cef77ba2371  581.1 kB
md5:bed4fc8cb65da46a24ec1bc6cea2763c  320.2 MB
md5:453ba69056aa33a3c6b97afb7b6afadb  510.0 MB
md5:2166903d947f0025f61e342da466f917  184.9 kB
md5:a1a0148e8120b0268f76d2e98a68436f  1.7 MB
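
The MD5 checksums listed with the files can be used to check that a download is complete and uncorrupted. The following is a minimal sketch using Python's standard `hashlib` module. The function names and the chunk size are illustrative choices, and which checksum belongs to which downloaded file has to be taken from the Zenodo record itself.

```python
import hashlib


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of(path) == expected
```

Reading in chunks matters here because several of the archive files are larger than 1 GB. A `False` result from `matches` suggests the file should be downloaded again.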