FOR-species20K dataset
Creators
- Puliti, Stefano (Contact person)1
- Lines, Emily2
- Müllerová, Jana3
- Frey, Julian4
- Schindler, Zoe5
- Straker, Adrian6
- Allen, Matthew J.2
- Winiwarter, Lukas7
- Rehush, Nataliia8
- Hristova, Hristina8
- Murray, Brent9
- Calders, Kim10
- Terryn, Louise10
- Coops, Nicholas9
- Höfle, Bernhard11
- Junttila, Samuli12
- Krucek, Martin13
- Krok, Grzegorz14
- Král, Kamil13
- Levick, Shaun R.15
- Luck, Linda16
- Missarov, Azim13
- Mokroš, Martin17
- Owen, Harry2
- Stereńczak, Krzysztof14
- Pitkänen, Timo18
- Puletti, Nicola19
- Saarinen, Ninni12
- Hopkinson, Chris20
- Torresan, Chiara21
- Tomelleri, Enrico22
- Weiser, Hannah11
- Astrup, Rasmus1
- 1. Norwegian Institute of Bioeconomy Research
- 2. University of Cambridge
- 3. Jan Evangelista Purkyně University in Ústí nad Labem
- 4. Albert-Ludwigs-Universität Freiburg
- 5. University of Freiburg
- 6. University of Göttingen
- 7. Universität Innsbruck
- 8. Swiss Federal Institute for Forest, Snow and Landscape Research
- 9. University of British Columbia
- 10. Ghent University
- 11. Heidelberg University
- 12. University of Eastern Finland
- 13. Silva Tarouca Research Institute for Landscape and Ornamental Gardening
- 14. Forest Research Institute
- 15. CSIRO Land and Water
- 16. Charles Darwin University
- 17. University College London
- 18. Natural Resources Institute Finland
- 19. Agricultural Research Council
- 20. University of Lethbridge
- 21. National Research Council
- 22. Free University of Bozen-Bolzano
Description
Data for benchmarking tree species classification from proximally-sensed laser scanning data.
Data split and usage
The data is split into:
- Development data (dev): this subset includes 90% of the trees in the dataset and consists of individual tree point clouds (*.laz) named according to the treeID column of the tree_metadata_dev.csv file, which also provides the tree species labels. These data are meant for model development and can thus be further split into training and validation datasets.
- Test data (test): this subset includes the remaining 10% of the trees (a balanced sample) and consists of individual tree point clouds (*.laz); the species labels are withheld for benchmarking purposes. To make use of the test data, users should predict the species of the test trees and output a table (.csv file) with one row per predicted tree and two columns (treeID and predicted_species). This table can then be used to create a new submission on the FOR-species20K Codabench benchmarking platform and obtain the evaluation metrics for the test data.
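The workflow described above (splitting the development data and writing a submission table) could be sketched in Python roughly as follows. This is a minimal illustration, not the authors' pipeline: the helper names (load_metadata, stratified_split, write_submission), the 20% validation fraction, and the label column name "species" are assumptions and should be checked against the actual header of tree_metadata_dev.csv. Only the treeID and predicted_species column names come from the description above.

```python
import csv
import random
from collections import defaultdict


def load_metadata(path):
    """Read a metadata CSV (e.g. tree_metadata_dev.csv) into a list of dicts."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


def stratified_split(rows, label_key="species", val_fraction=0.2, seed=0):
    """Split metadata rows into train/validation lists, species by species.

    label_key is an assumed column name; adjust it to the real CSV header.
    """
    by_species = defaultdict(list)
    for row in rows:
        by_species[row[label_key]].append(row)

    rng = random.Random(seed)
    train, val = [], []
    for species in sorted(by_species):
        group = by_species[species]
        rng.shuffle(group)
        # Hold out at least one tree per species, unless it has a single tree.
        n_val = max(1, round(len(group) * val_fraction)) if len(group) > 1 else 0
        val.extend(group[:n_val])
        train.extend(group[n_val:])
    return train, val


def write_submission(predictions, path):
    """Write one row per test tree with columns treeID and predicted_species.

    predictions maps each test treeID to the predicted species label.
    """
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["treeID", "predicted_species"])
        for tree_id, species in predictions.items():
            writer.writerow([tree_id, species])
```

A typical use would be to call load_metadata on tree_metadata_dev.csv, pass the rows to stratified_split, train on the first list, tune on the second, and finally call write_submission with the predictions for the test trees. Splitting per species keeps rare species represented in both subsets, which is one reasonable choice among several.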
Cite
Any scientific publication using the data should cite the following paper:
Puliti, S., Lines, E., Müllerová, J., Frey, J., Schindler, Z., Straker, A., Allen, M.J., Winiwarter, L., Rehush, N., Hristova, H., Murray, B., Calders, K., Terryn, L., Coops, N., Höfle, B., Krůček, M., Krok, G., Král, K., Luck, L., Levick, S.R., Missarov, A., Mokroš, M., Owen, H., Stereńczak, K., Pitkänen, T.P., Puletti, N., Saarinen, N., Hopkinson, C., Torresan, C., Tomelleri, E., Weiser, H., Junttila, S., and Astrup, R. (2024) Benchmarking tree species classification from proximally-sensed laser scanning data: introducing the FOR-species20K dataset. arXiv.
Files
dev.zip
Additional details
Dates
- Available
- 2024
Software
- Repository URL
- https://github.com/stefp/FOR-species
- Programming language
- Python
- Development Status
- Active