FOR-species20K dataset
Creators
- Puliti, Stefano (Contact person) [1]
- Lines, Emily [2]
- Müllerová, Jana [3]
- Frey, Julian [4]
- Schindler, Zoe [5]
- Straker, Adrian [6]
- Allen, Matthew J. [2]
- Winiwarter, Lukas [7]
- Rehush, Nataliia [8]
- Hristova, Hristina [8]
- Murray, Brent [9]
- Calders, Kim [10]
- Terryn, Louise [10]
- Coops, Nicholas [9]
- Höfle, Bernhard [11]
- Junttila, Samuli [12]
- Krucek, Martin [13]
- Krok, Grzegorz [14]
- Král, Kamil [13]
- Levick, Shaun R. [15]
- Luck, Linda [16]
- Missarov, Azim [13]
- Mokroš, Martin [17]
- Owen, Harry [2]
- Stereńczak, Krzysztof [14]
- Pitkänen, Timo [18]
- Puletti, Nicola [19]
- Saarinen, Ninni [12]
- Hopkinson, Chris [20]
- Torresan, Chiara [21]
- Tomelleri, Enrico [22]
- Weiser, Hannah [11]
- Astrup, Rasmus [1]
- 1. Norwegian Institute of Bioeconomy Research
- 2. University of Cambridge
- 3. Jan Evangelista Purkyně University in Ústí nad Labem
- 4. Albert-Ludwigs-Universität Freiburg
- 5. University of Freiburg
- 6. University of Göttingen
- 7. Universität Innsbruck
- 8. Swiss Federal Institute for Forest, Snow and Landscape Research
- 9. University of British Columbia
- 10. Ghent University
- 11. Heidelberg University
- 12. University of Eastern Finland
- 13. Silva Tarouca Research Institute for Landscape and Ornamental Gardening
- 14. Forest Research Institute
- 15. CSIRO Land and Water
- 16. Charles Darwin University
- 17. University College London
- 18. Natural Resources Institute Finland
- 19. Consiglio per la ricerca in agricoltura e l'analisi dell'economia agraria
- 20. University of Lethbridge
- 21. National Research Council
- 22. Free University of Bozen-Bolzano
Description
Data for benchmarking tree species classification from proximally-sensed laser scanning data.
Data split and usage
The data is split into:
- Development data (dev): this includes 90% of the trees in the dataset and consists of individual tree point clouds (*.laz) named according to the treeID column in the tree_metadata_dev.csv file, from which the tree_species labels are available. These data are meant for model development and can thus be further split into training and validation datasets.
- Test data (test): these are 10% of the trees (balanced sample). They include individual tree point clouds (*.laz) but, for benchmarking purposes, the species labels are withheld. To make use of the test data, users should predict the species of the test trees and output a table (.csv file) with one row per predicted tree and two columns (treeID and predicted_species). This table can then be used to create a new submission on the FOR-species20K Codabench benchmarking platform and obtain the evaluation metrics for the test data.
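The workflow described above can be sketched in Python as follows. This is a minimal illustration using only the standard library, not the authors' pipeline: it splits the development metadata into training and validation rows per species and writes a submission table with the treeID and predicted_species columns. The helper names `stratified_split` and `write_submission`, the 20% validation fraction, and the name of the label column passed as `label_key` are assumptions made for illustration; check the header of tree_metadata_dev.csv for the actual label column.

```python
import csv
import random
from collections import defaultdict


def stratified_split(rows, label_key, val_fraction=0.2, seed=0):
    """Split metadata rows into (train, val) lists, species by species.

    `rows` is a list of dicts (e.g. from csv.DictReader) and `label_key`
    is the name of the species column, which is an assumption here.
    """
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    rng = random.Random(seed)
    train, val = [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        # Species with a single tree stay entirely in the training set;
        # otherwise at least one tree is held out for validation.
        n_val = max(1, round(len(group) * val_fraction)) if len(group) > 1 else 0
        val.extend(group[:n_val])
        train.extend(group[n_val:])
    return train, val


def write_submission(predictions, handle):
    """Write one row per test tree with the two columns named in the text.

    `predictions` maps treeID -> predicted species; `handle` is an open
    text file (opened with newline="").
    """
    writer = csv.writer(handle)
    writer.writerow(["treeID", "predicted_species"])
    for tree_id, species in predictions.items():
        writer.writerow([tree_id, species])
```

A typical use would be to read tree_metadata_dev.csv with `csv.DictReader`, pass the resulting rows to `stratified_split`, train a classifier on the training rows, and then call `write_submission` with the predictions for the test trees on a file opened via `open("submission.csv", "w", newline="")`. The file name submission.csv is only a placeholder.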
Cite
Any scientific publication using the data should cite the following paper:
Puliti, S., Lines, E., Müllerová, J., Frey, J., Schindler, Z., Straker, A., Allen, M.J., Winiwarter, L., Rehush, N., Hristova, H., Murray, B., Calders, K., Terryn, L., Coops, N., Höfle, B., Krůček, M., Krok, G., Král, K., Luck, L., Levick, S.R., Missarov, A., Mokroš, M., Owen, H., Stereńczak, K., Pitkänen, T.P., Puletti, N., Saarinen, N., Hopkinson, C., Torresan, C., Tomelleri, E., Weiser, H., Junttila, S., and Astrup, R. (2025) Benchmarking tree species classification from proximally-sensed laser scanning data: introducing the FOR-species20K dataset. Methods in Ecology and Evolution, 00, 1–18.
Files
dev.zip
Additional details
Dates
- Available
- 2024
Software
- Repository URL
- https://github.com/stefp/FOR-species
- Programming language
- Python
- Development Status
- Active