Published May 7, 2025 | Version TreeAI.V1.2
Dataset
Restricted
TreeAI Global Initiative - Advancing tree species identification from aerial images with deep learning
Creators
- 1. ETH Zürich, Department of Environmental Systems Sciences
- 2. ETH Zürich
- 3. ETH Zurich
- 4. Swiss Federal Institute for Forest, Snow and Landscape Research
- 5. University of Freiburg
- 6. Leipzig University
- 7. Norwegian Institute for Bioeconomy Research
- 8. University of Copenhagen
- 9. Wuhan University
- 10. Czech University of Life Sciences Prague
- 11. Technical University of Zvolen
Description
TreeAI - Advancing Tree Species Identification from Aerial Images with Deep Learning
Data Structure for the TreeAI Database Used in the TreeAI4Species Competition
The dataset is organized into two distinct challenges: Object Detection and Semantic Segmentation. Below is a more detailed description of the data for each challenge:
Object detection
The data are in the COCO format. Each folder contains training and validation subfolders with images and labels that carry the tree species ID.
Tree species: 61 tree species (classes).
Training: Images (.png) and Labels (.txt)
Validation: Images (.png) and Labels (.txt)
Images: RGB bands, 8-bit. Further details (spatial resolution, labels, etc.) are given in Table 1.
Labels: Prepared for object detection tasks. The number of classes varies per dataset, e.g. dataset 12_RGB_all_L has 53 classes, but species IDs are standardized across most datasets (except for 0_RGB_fL). The Latin name of the species for each class ID is given in the file named classDatasetName.xlsx.
Species class: the Excel file “classDatasetName.xlsx” contains 4 columns: Species_ID (Sp_ID), Labels (number of labels for training and validation), and Species_Class (Latin name of the species).
Masked images: The dataset with partial labels was masked, i.e. a buffer of 30 pixels (1.5 m) was created around a label, and the image was masked based on these buffers. The masked images are stored in the `images_masked` folder within training and validation subsets, e.g. `34_RGB_ObjDet_640_pL_b\train\images_masked`.
Additional filters to clean up the data:
Labels at the edge: only images with labels at the edge were removed.
Valid labels: images with labels that were completely within an image have been retained.
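The masking and edge-filter steps described above can be sketched as follows. This is a minimal illustration, not the TreeAI preprocessing code. It assumes that labels are available as pixel-coordinate boxes `(x_min, y_min, x_max, y_max)`, that the buffer is a simple rectangular expansion of each box, and that masked pixels are set to 0. The function names `mask_outside_buffers` and `touches_edge` are hypothetical. At 5 cm resolution, 30 pixels correspond to 30 × 0.05 m = 1.5 m, which matches the buffer stated above.

```python
import numpy as np

BUFFER_PX = 30        # buffer size stated in the dataset description
RESOLUTION_M = 0.05   # 5 cm per pixel (assumed from the 1.5 m figure)


def mask_outside_buffers(image, boxes, buffer_px=BUFFER_PX):
    """Keep pixels within buffer_px of any box; set all other pixels to 0."""
    height, width = image.shape[:2]
    keep = np.zeros((height, width), dtype=bool)
    for x_min, y_min, x_max, y_max in boxes:
        # Expand the box by the buffer and clip it to the image extent.
        x0 = max(int(x_min) - buffer_px, 0)
        y0 = max(int(y_min) - buffer_px, 0)
        x1 = min(int(x_max) + buffer_px, width)
        y1 = min(int(y_max) + buffer_px, height)
        keep[y0:y1, x0:x1] = True
    masked = image.copy()
    masked[~keep] = 0  # assumption: masked-out pixels become black
    return masked


def touches_edge(box, width, height):
    """Return True if the box reaches the image border (exclusive max coords)."""
    x_min, y_min, x_max, y_max = box
    return x_min <= 0 or y_min <= 0 or x_max >= width or y_max >= height


# Synthetic example: a white 640 x 640 image with one labeled tree.
image = np.full((640, 640, 3), 255, dtype=np.uint8)
boxes = [(300, 300, 340, 340)]
masked = mask_outside_buffers(image, boxes)
buffer_m = BUFFER_PX * RESOLUTION_M  # buffer width in metres
```

An image would be dropped by the edge filter if `touches_edge` returns `True` for any of its boxes; whether the original pipeline used exactly this criterion is not stated in the record.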
Object detection dataset
Table 1. Description of the datasets for object detection included in the TreeAI database. Res. = spatial resolution.
a) Fully labeled images (i.e. the image has all the trees delineated and each polygon has species information)
b) Partially labeled images (i.e. the image has only some trees delineated, and each polygon has species information)
| No. | Dataset name | Res. (cm) | Training images | Validation images | Training labels | Validation labels | Fully labeled | Partially labeled |
|---|---|---|---|---|---|---|---|---|
| 1 | 12_RGB_ObjDet_640_fL | 5 | 1061 | 303 | 53910 | 14323 | x | |
| 2 | 0_RGB_fL | 3 | 422 | 84 | 51500 | 11137 | x | |
| 3 | 34_RGB_ObjDet_640_pLa | 5 | 946 | 271 | 4249 | 1214 | | x |
| 4 | 34_RGB_ObjDet_640_pLb | 5 | 354 | 101 | 1887 | 581 | | x |
| 5 | 5_RGB_S_320_pL | 10 | 8889 | 2688 | 19561 | 5915 | | x |
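To make the difference between fully and partially labeled datasets in Table 1 concrete, the short sketch below computes the average number of training labels per training image. The counts are copied from Table 1. The "labels per image" figure is only an illustrative statistic and is not reported in the record. The names `TABLE_1` and `labels_per_image` are hypothetical.

```python
# (dataset name, training images, validation images,
#  training labels, validation labels), copied from Table 1
TABLE_1 = [
    ("12_RGB_ObjDet_640_fL", 1061, 303, 53910, 14323),
    ("0_RGB_fL", 422, 84, 51500, 11137),
    ("34_RGB_ObjDet_640_pLa", 946, 271, 4249, 1214),
    ("34_RGB_ObjDet_640_pLb", 354, 101, 1887, 581),
    ("5_RGB_S_320_pL", 8889, 2688, 19561, 5915),
]


def labels_per_image(n_labels, n_images):
    """Average number of labels per image."""
    return n_labels / n_images


# Training-split label density for each dataset, rounded to one decimal.
summary = {
    name: round(labels_per_image(train_labels, train_images), 1)
    for name, train_images, _, train_labels, _ in TABLE_1
}

total_train_images = sum(row[1] for row in TABLE_1)
total_train_labels = sum(row[3] for row in TABLE_1)

for name, density in summary.items():
    print(f"{name}: {density} training labels per image")
```

The fully labeled datasets come out at tens of labels per image, while the partially labeled ones have only a few, which is consistent with the masking applied to the partially labeled images.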
Semantic segmentation dataset
Each folder contains training and validation subfolders with images and corresponding segmentation masks, where each pixel is assigned to a specific class.
Tree species: 61 tree species (classes).
Training: Images (.png) and Labels (.png)
Validation: Images (.png) and Labels (.png)
Images: RGB bands, 8-bit, 5 cm spatial resolution. Further details are given in Table 2.
Labels: Prepared for the semantic segmentation task. The number of classes varies per dataset, e.g. dataset `12_RGB_SemSegm_640_fL` has 57 classes, but the labels are standardized across both datasets.
Table 2. Description of the datasets for semantic segmentation included in the TreeAI database.
| No. | Dataset name | Training images | Validation images | Fully labeled | Partially labeled |
|---|---|---|---|---|---|
| a. | 12_RGB_SemSegm_640_fL | 1110 | 318 | x | |
| b. | 34_RGB_SemSegm_640_pL | 1564 | 446 | | x |
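Because each pixel of a segmentation mask is assigned to a class, a per-class pixel count is a quick way to inspect a mask. The sketch below is a hedged illustration on a synthetic array. It assumes that a mask is a single-channel array of integer species IDs and that 0 marks background or unlabeled pixels; neither detail is confirmed by the record. The function name `class_pixel_counts` is hypothetical.

```python
import numpy as np

NUM_CLASSES = 61  # number of tree species stated in the description


def class_pixel_counts(mask, num_classes=NUM_CLASSES):
    """Return {class_id: pixel_count} for every class present in the mask."""
    # bincount needs a flat array of non-negative integers.
    counts = np.bincount(mask.ravel(), minlength=num_classes + 1)
    return {int(c): int(n) for c, n in enumerate(counts) if n > 0}


# Synthetic 640 x 640 mask with two tree crowns of classes 3 and 12.
mask = np.zeros((640, 640), dtype=np.uint8)
mask[100:200, 100:200] = 3    # 100 x 100 pixels
mask[400:450, 400:500] = 12   # 50 x 100 pixels

counts = class_pixel_counts(mask)
```

For a real mask, the `.png` file would first be read into an array with an image library of choice; the counting step stays the same.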
Steps to access the dataset and participate in the TreeAI4Species competition:
- Register: Access to the data will be granted upon registering for the competition, see the registration form: https://form.ethz.ch/research/tree-ai-global-database/treeai-competition.html
- Download the dataset: Download the competition record after registration.
- Test dataset: Only the participants registered for the competition will receive the test dataset.
- Challenges: 1) object detection and 2) semantic segmentation.
- Submit your DL models for evaluation by July 2025.
- Award: The best models for object detection and semantic segmentation will win a prize.
- Publication: All participants in the competition who submit the required files for evaluation will be included in the subsequent publication.
License
== CC BY-NC-ND (Attribution-NonCommercial-NoDerivatives) ==
Dear user,
We appreciate your interest in the TreeAI4Species Competition: https://form.ethz.ch/research/tree-ai-global-database.html
DATA ANALYSIS AND PUBLICATION
The TreeAI database is released under a variant of the CC BY-NC-ND license. This database is created for the TreeAI4Species data science competition. It is not permitted to pass on the data or the characteristics directly derived from it to third parties. Written consent from the data supplier is required for use for any other purpose.
LIABILITY
The data are based on the current state of scientific knowledge. However, no liability is accepted for their completeness. This is the second version of the database, and we may improve the tree annotations and include new tree species in future versions.
The data may only be used for the purpose described by the provider.
------------------------------------------------------
ETH Zürich
Dr. Mirela Beloiu Schwenke
Institute of Terrestrial Ecosystems
Department of Environmental Systems Science, CHN K75
Universitätstrasse 16, 8092 Zürich, Schweiz
mirela.beloiu@usys.ethz.ch
Additional details
Related works
- Is referenced by
- Dataset: 10.5281/zenodo.15351054 (DOI)
Funding
- Swiss National Science Foundation
- Identifying tree species in RGB aerial images and terrestrial LiDAR using Deep Learning 213355
Dates
- Created: 2025-05-08
References
- 10.5281/zenodo.15351054