10.5281/zenodo.6598391
https://zenodo.org/records/6598391
oai:zenodo.org:6598391
Schulz, Christian
Christian
Schulz
0000-0001-9792-6958
Technische Universität Berlin, Geoinformation in Environmental Planning Lab
Ahlswede, Steve
Steve
Ahlswede
0000-0001-5605-1999
Technische Universität Berlin, Remote Sensing Image Analysis Group
Gava, Christiano
Christiano
Gava
Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Smart Data and Knowledge Services
Helber, Patrick
Patrick
Helber
0000-0001-8454-4301
Vision Impulse GmbH
Bischke, Benjamin
Benjamin
Bischke
0000-0002-6473-3348
Vision Impulse GmbH
Arias, Florencia
Florencia
Arias
Technische Universität Berlin, Geoinformation in Environmental Planning Lab
Förster, Michael
Michael
Förster
0000-0001-6689-5714
Technische Universität Berlin, Geoinformation in Environmental Planning Lab
Hees, Jörn
Jörn
Hees
0000-0002-0084-8998
Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Smart Data and Knowledge Services
Demir, Begüm
Begüm
Demir
0000-0003-2175-7072
Technische Universität Berlin, Remote Sensing Image Analysis Group
Kleinschmit, Birgit
Birgit
Kleinschmit
0000-0002-8494-0053
Technische Universität Berlin, Geoinformation in Environmental Planning Lab
TreeSatAI Benchmark Archive for Deep Learning in Forest Applications
Zenodo
2022
deep learning
forest
tree
image classification
remote sensing
aerial imagery
species mapping
sentinel-2
sentinel-1
multi-modal fusion
artificial intelligence
lower saxony
forest inventory
mid-Europe
2022-08-01
eng
10.5281/zenodo.6598390
https://zenodo.org/communities/treesatai
1.0.0
Creative Commons Attribution 4.0 International
Context and Aim
Deep learning in Earth Observation requires large image archives with highly reliable labels for model training and testing. However, a preferable quality standard for forest applications in Europe has not yet been determined. The TreeSatAI consortium investigated numerous sources for annotated datasets as an alternative to manually labeled training datasets.
We found the federal forest inventory of Lower Saxony, Germany represents an unseen treasure of annotated samples for training data generation. The respective 20-cm Color-infrared (CIR) imagery, which is used for forestry management through visual interpretation, constitutes an excellent baseline for deep learning tasks such as image segmentation and classification.
Description
The data archive is highly suitable for benchmarking as it represents the real-world data situation of many German forest management services. One the one hand, it has a high number of samples which are supported by the high-resolution aerial imagery. On the other hand, this data archive presents challenges, including class label imbalances between the different forest stand types.
The TreeSatAI Benchmark Archive contains:
50,381 image triplets (aerial, Sentinel-1, Sentinel-2)
synchronized time steps and locations
all original spectral bands/polarizations from the sensors
20 species classes (single labels)
12 age classes (single labels)
15 genus classes (multi labels)
60 m and 200 m patches
fixed split for train (90%) and test (10%) data
additional single labels such as English species name, genus, forest stand type, foliage type, land cover
The geoTIFF and GeoJSON files are readable in any GIS software, such as QGIS. For further information, we refer to the PDF document in the archive and publications in the reference section.
Version history
v1.0.0 - First release
Citation
Ahlswede et al. (in prep.)
GitHub
Full code examples and pre-trained models from the dataset article (Ahlswede et al. 2022) using the TreeSatAI Benchmark Archive are published on the GitHub repositories of the Remote Sensing Image Analysis (RSiM) Group (https://git.tu-berlin.de/rsim/treesat_benchmark). Code examples for the sampling strategy can be made available by Christian Schulz via email request.
Folder structure
We refer to the proposed folder structure in the PDF file.
Folder “aerial” contains the aerial imagery patches derived from summertime orthophotos of the years 2011 to 2020. Patches are available in 60 x 60 m (304 x 304 pixels). Band order is near-infrared, red, green, and blue. Spatial resolution is 20 cm.
Folder “s1” contains the Sentinel-1 imagery patches derived from summertime mosaics of the years 2015 to 2020. Patches are available in 60 x 60 m (6 x 6 pixels) and 200 x 200 m (20 x 20 pixels). Band order is VV, VH, and VV/VH ratio. Spatial resolution is 10 m.
Folder “s2” contains the Sentinel-2 imagery patches derived from summertime mosaics of the years 2015 to 2020. Patches are available in 60 x 60 m (6 x 6 pixels) and 200 x 200 m (20 x 20 pixels). Band order is B02, B03, B04, B08, B05, B06, B07, B8A, B11, B12, B01, and B09. Spatial resolution is 10 m.
The folder “labels” contains a JSON string which was used for multi-labeling of the training patches. Code example of an image sample with respective proportions of 94% for Abies and 6% for Larix is: "Abies_alba_3_834_WEFL_NLF.tif": [["Abies", 0.93771], ["Larix", 0.06229]]
The two files “test_filesnames.lst” and “train_filenames.lst” define the filenames used for train (90%) and test (10%) split. We refer to this fixed split for better reproducibility and comparability.
The folder “geojson” contains geoJSON files with all the samples chosen for the derivation of training patch generation (point, 60 m bounding box, 200 m bounding box).
CAUTION: As we could not upload the aerial patches as a single zip file on Zenodo, you need to download the 20 single species files (aerial_60m_…zip) separately. Then, unzip them into a folder named “aerial” with a subfolder named “60m”. This structure is recommended for better reproducibility and comparability to the experimental results of Ahlswede et al. (2022),
Join the archive
Model training, benchmarking, algorithm development… many applications are possible! Feel free to add samples from other regions in Europe or even worldwide. Additional remote sensing data from Lidar, UAVs or aerial imagery from different time steps are very welcome. This helps the research community in development of better deep learning and machine learning models for forest applications. You might have questions or want to share code/results/publications using that archive? Feel free to contact the authors.
Project description
This work was part of the project TreeSatAI (Artificial Intelligence with Satellite data and Multi-Source Geodata for Monitoring of Trees at Infrastructures, Nature Conservation Sites and Forests). Its overall aim is the development of AI methods for the monitoring of forests and woody features on a local, regional and global scale. Based on freely available geodata from different sources (e.g., remote sensing, administration maps, and social media), prototypes will be developed for the deep learning-based extraction and classification of tree- and tree stand features. These prototypes deal with real cases from the monitoring of managed forests, nature conservation and infrastructures. The development of the resulting services by three enterprises (liveEO, Vision Impulse and LUP Potsdam) will be supported by three research institutes (German Research Center for Artificial Intelligence, TU Remote Sensing Image Analysis Group, TUB Geoinformation in Environmental Planning Lab).
Publications
Ahlswede et al. (2022, in prep.): TreeSatAI Dataset Publication
Ahlswede S., Nimisha, T.M., and Demir, B. (2022, in revision): Embedded Self-Enhancement Maps for Weakly Supervised Tree Species Mapping in Remote Sensing Images. IEEE Trans Geosci Remote Sens
Schulz et al. (2022, in prep.): Phenoprofiling
Conference contributions
S. Ahlswede, N. T. Madam, C. Schulz, B. Kleinschmit and B. Demіr, "Weakly Supervised Semantic Segmentation of Remote Sensing Images for Tree Species Classification Based on Explanation Methods", IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, 2022.
C. Schulz, M. Förster, S. Vulova, T. Gränzig and B. Kleinschmit, “Exploring the temporal fingerprints of mid-European forest types from Sentinel-1 RVI and Sentinel-2 NDVI time series”, IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, 2022.
C. Schulz, M. Förster, S. Vulova and B. Kleinschmit, “The temporal fingerprints of common European forest types from SAR and optical remote sensing data”, AGU Fall Meeting, New Orleans, USA, 2021.
B. Kleinschmit, M. Förster, C. Schulz, F. Arias, B. Demir, S. Ahlswede, A. K. Aksoy, T. Ha Minh, J. Hees, C. Gava, P. Helber, B. Bischke, P. Habelitz, A. Frick, R. Klinke, S. Gey, D. Seidel, S. Przywarra, R. Zondag and B. Odermatt, “Artificial Intelligence with Satellite data and Multi-Source Geodata for Monitoring of Trees and Forests”, Living Planet Symposium, Bonn, Germany, 2022.
C. Schulz, M. Förster, S. Vulova, T. Gränzig and B. Kleinschmit, (2022, submitted): “Exploring the temporal fingerprints of sixteen mid-European forest types from Sentinel-1 and Sentinel-2 time series”, ForestSAT, Berlin, Germany, 2022.
TreeSatAI was funded by the Bundesministerium für Bildung und Forschung (BMBF) under the grant number 01IS20014A. We are grateful to the Niedersächsische Landesforsten for providing the forest management data, the inventory data and the aerial imagery.