Published May 7, 2025 | Version TreeAI.V1.2
Dataset Restricted

TreeAI Global Initiative - Advancing tree species identification from aerial images with deep learning

  • 1. ETH Zürich, Department of Environmental Systems Sciences
  • 2. ETH Zürich
  • 3. ETH Zurich
  • 4. Swiss Federal Institute for Forest, Snow and Landscape Research
  • 5. University of Freiburg
  • 6. Leipzig University
  • 7. Norwegian Institute for Bioeconomy Research
  • 8. University of Copenhagen
  • 9. Wuhan University
  • 10. Czech University of Life Sciences Prague
  • 11. Technical University of Zvolen

Description

TreeAI - Advancing Tree Species Identification from Aerial Images with Deep Learning

Data Structure for the TreeAI Database Used in the TreeAI4Species Competition

The dataset is organized into two distinct challenges: Object Detection and Semantic Segmentation. Below is a more detailed description of the data for each challenge:

Object detection

The data are in COCO format. Each folder contains training and validation subfolders with images and labels, and each label carries a tree species ID.
Tree species: 61 tree species (classes). 
Training: Images (.png) and Labels (.txt)
Validation: Images (.png) and Labels (.txt)
Images: RGB bands, 8-bit. Further details (spatial resolution, labels, etc) are given in Table 1.
Labels: Prepared for object detection tasks. The number of classes varies per dataset, e.g. dataset 12_RGB_all_L has 53 classes, but species IDs are standardized across most datasets (except for 0_RGB_fL). The Latin name of the species is given for each class ID in the file named classDatasetName.xlsx.
Species class: The Excel file `classDatasetName.xlsx` contains 4 columns: Species_ID (Sp_ID), Labels (number of labels for training and validation), and Species_Class (Latin name of the species).
Masked images: The dataset with partial labels was masked, i.e. a buffer of 30 pixels (1.5 m) was created around a label, and the image was masked based on these buffers. The masked images are stored in the `images_masked` folder within training and validation subsets, e.g. `34_RGB_ObjDet_640_pL_b\train\images_masked`.
Additional filters to clean up the data:
  • Labels at the edge: images with labels at the image edge were removed; no other images were dropped by this filter.
  • Valid labels: images whose labels lie completely within the image were retained.
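The masking and edge-filter steps described above can be sketched as follows. This is a minimal illustration, not the procedure used to build the database: it assumes labels are available as pixel-coordinate boxes `(x_min, y_min, x_max, y_max)`, that the buffer is rectangular, and that masked pixels are set to 0. The function names `mask_outside_buffers` and `box_is_fully_inside` are hypothetical.

```python
import numpy as np


def mask_outside_buffers(image, boxes, buffer_px=30):
    """Return a copy of image with pixels outside all buffered boxes set to 0.

    image: H x W x C uint8 array.
    boxes: iterable of (x_min, y_min, x_max, y_max) in pixels
           (assumed layout; convert from the label files as needed).
    """
    height, width = image.shape[:2]
    keep = np.zeros((height, width), dtype=bool)
    for x_min, y_min, x_max, y_max in boxes:
        # Grow the box by the buffer and clip it to the image bounds.
        x0 = max(int(x_min) - buffer_px, 0)
        y0 = max(int(y_min) - buffer_px, 0)
        x1 = min(int(x_max) + buffer_px, width)
        y1 = min(int(y_max) + buffer_px, height)
        keep[y0:y1, x0:x1] = True
    masked = image.copy()
    masked[~keep] = 0  # assumption: masked pixels are black
    return masked


def box_is_fully_inside(box, width, height):
    """True if the box does not touch the image border (assumed edge rule)."""
    x_min, y_min, x_max, y_max = box
    return x_min > 0 and y_min > 0 and x_max < width and y_max < height


# A 30-pixel buffer at 5 cm resolution corresponds to 1.5 m on the ground.
buffer_px, res_cm = 30, 5
buffer_m = buffer_px * res_cm / 100

# Synthetic white image with one labeled box in the centre.
image = np.full((100, 100, 3), 255, dtype=np.uint8)
boxes = [(40, 40, 60, 60)]
masked = mask_outside_buffers(image, boxes, buffer_px)
```

Whether a box that exactly touches the border counts as "at the edge" is a design choice; the strict comparison above treats it as an edge label.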

 

Object detection dataset

Table 1. Description of the datasets for object detection included in the TreeAI database. Res. = spatial resolution.

a) Fully labeled images (i.e. the image has all the trees delineated and each polygon has species information)

b) Partially labeled images (i.e. the image has only some trees delineated, and each polygon has species information)

| No. | Dataset name | Res. (cm) | Training images | Validation images | Training labels | Validation labels | Fully labeled | Partially labeled |
|---|---|---|---|---|---|---|---|---|
| 1 | 12_RGB_ObjDet_640_fL | 5 | 1061 | 303 | 53910 | 14323 | x | |
| 2 | 0_RGB_fL | 3 | 422 | 84 | 51500 | 11137 | x | |
| 3 | 34_RGB_ObjDet_640_pLa | 5 | 946 | 271 | 4249 | 1214 | | x |
| 4 | 34_RGB_ObjDet_640_pLb | 5 | 354 | 101 | 1887 | 581 | | x |
| 5 | 5_RGB_S_320_pL | 10 | 8889 | 2688 | 19561 | 5915 | | x |
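As a quick way to see how fully and partially labeled datasets differ, the training counts in Table 1 can be turned into an average number of labels per image. The sketch below uses only the figures from the table; the dictionary name `table1_training` and the helper `labels_per_image` are illustrative.

```python
# (training images, training labels) per dataset, copied from Table 1.
table1_training = {
    "12_RGB_ObjDet_640_fL": (1061, 53910),
    "0_RGB_fL": (422, 51500),
    "34_RGB_ObjDet_640_pLa": (946, 4249),
    "34_RGB_ObjDet_640_pLb": (354, 1887),
    "5_RGB_S_320_pL": (8889, 19561),
}


def labels_per_image(n_images, n_labels):
    """Average number of labels per image."""
    return n_labels / n_images


density = {
    name: labels_per_image(n_images, n_labels)
    for name, (n_images, n_labels) in table1_training.items()
}

for name, value in density.items():
    print(f"{name}: {value:.1f} labels per training image")
```

The fully labeled datasets average roughly 51 and 122 labels per training image, while the partially labeled ones average between about 2 and 5, which is worth keeping in mind when mixing them in one training run.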

 

Semantic segmentation dataset

Each folder contains training and validation subfolders with images and corresponding segmentation masks, where each pixel is assigned to a specific class.
Tree species: 61 tree species (classes). 
Training: Images (.png) and Labels (.png)
Validation: Images (.png) and Labels (.png)
Images: RGB bands, 8-bit, 5 cm spatial resolution. Further details are given in Table 2.
Labels: Prepared for the semantic segmentation task. The number of classes varies per dataset, e.g. dataset `12_RGB_SemSegm_640_fL`  has 57 classes, but the labels are standardized across both datasets.
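Because each pixel of a segmentation mask is assigned to one class, the class composition of a mask can be inspected by counting pixel values. The following is a minimal sketch under the assumption that the pixel value of a label PNG is the class ID; how background or unlabeled pixels are encoded is not stated here and should be checked against the data. The helper name `class_pixel_counts` is hypothetical, and a small synthetic array stands in for a real mask.

```python
import numpy as np


def class_pixel_counts(mask):
    """Return {class_id: pixel_count} for a 2-D integer label mask."""
    ids, counts = np.unique(mask, return_counts=True)
    return {int(i): int(c) for i, c in zip(ids, counts)}


# Synthetic 4 x 4 mask standing in for a real label PNG.
# A real mask could be loaded with Pillow, e.g. np.array(Image.open(path)).
mask = np.array(
    [
        [0, 0, 3, 3],
        [0, 0, 3, 3],
        [7, 7, 7, 7],
        [0, 0, 0, 0],
    ],
    dtype=np.uint8,
)

counts = class_pixel_counts(mask)
print(counts)
```

Summing these counts over all training masks gives a per-class pixel frequency, which can help when choosing class weights for an imbalanced set of species.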
 
Table 2. Description of the datasets for semantic segmentation included in the TreeAI database.

| No. | Dataset name | Training images | Validation images | Fully labeled | Partially labeled |
|---|---|---|---|---|---|
| a. | 12_RGB_SemSegm_640_fL | 1110 | 318 | x | |
| b. | 34_RGB_SemSegm_640_pL | 1564 | 446 | | x |

 

Steps to access the dataset and participate in the TreeAI4Species competition:
  • Register: Access to the data will be granted upon registering for the competition, see the registration form: https://form.ethz.ch/research/tree-ai-global-database/treeai-competition.html 
  • Download the dataset: Download the competition record after registration.
  • Test dataset: Only the participants registered for the competition will receive the test dataset.
  • Challenges: 1) object detection and 2) semantic segmentation. 
  • Submission: Submit your deep learning models for evaluation by July 2025.
  • Award: The best models for object detection and semantic segmentation will win a prize.
  • Publication: All participants in the competition who submit the required files for evaluation will be included in the subsequent publication.

License

== CC BY-NC-ND (Attribution-NonCommercial-NoDerivatives) ==
 
Dear user,
 
DATA ANALYSIS AND PUBLICATION
The TreeAI database is released under a variant of the CC BY-NC-ND license. The database was created for the TreeAI4Species data science competition. Passing on the data, or characteristics directly derived from the data, to third parties is not permitted. Use for any other purpose requires written consent from the data supplier.
LIABILITY
The data are based on the current state of scientific knowledge. However, no liability is accepted for their completeness. This is the second version of the database, and future versions may improve the tree annotations and include new tree species.
The data can only be used for the purpose described by the provider.
 
------------------------------------------------------
ETH Zürich
Dr. Mirela Beloiu Schwenke
Institute of Terrestrial Ecosystems 
Department of Environmental Systems Science, CHN K75
Universitätstrasse 16, 8092 Zürich, Schweiz
mirela.beloiu@usys.ethz.ch

 

Files

Restricted

The record is publicly accessible, but files are restricted to users with access.

Additional details

Related works

Is referenced by
Dataset: 10.5281/zenodo.15351054 (DOI)

Funding

Swiss National Science Foundation
Identifying tree species in RGB aerial images and terrestrial LiDAR using Deep Learning 213355

Dates

Created
2025-05-08

References

  • 10.5281/zenodo.15351054