Published August 7, 2024 | Version v1
Dataset Open

FOR-species20K dataset

  • 1. Norwegian Institute of Bioeconomy Research
  • 2. University of Cambridge
  • 3. Jan Evangelista Purkyně University in Ústí nad Labem
  • 4. Albert-Ludwigs-Universität Freiburg
  • 5. University of Freiburg
  • 6. University of Göttingen
  • 7. Universität Innsbruck
  • 8. Swiss Federal Institute for Forest, Snow and Landscape Research
  • 9. University of British Columbia
  • 10. Ghent University
  • 11. Heidelberg University
  • 12. University of Eastern Finland
  • 13. Silva Tarouca Research Institute for Landscape and Ornamental Gardening
  • 14. Forest Research Institute
  • 15. CSIRO Land and Water
  • 16. Charles Darwin University
  • 17. University College London
  • 18. Natural Resources Institute Finland
  • 19. Agricultural Research Council
  • 20. University of Lethbridge
  • 21. National Research Council
  • 22. Free University of Bozen-Bolzano

Description

Data for benchmarking tree species classification from proximally-sensed laser scanning data.

Data split and usage

The data is split into:

  • Development data (dev): 90% of the trees in the dataset, provided as individual tree point clouds (*.laz) named according to the treeID column of the tree_metadata_dev.csv file, which also provides the tree_species labels. These data are meant for model development and can thus be further split into training and validation datasets.
  • Test data (test): 10% of the trees (balanced sample), also provided as individual tree point clouds (*.laz); for benchmarking purposes, the species labels are withheld. To make use of the test data, users should predict the species of the test trees and output a table (.csv file) with one row per predicted tree and two columns (treeID and predicted_species). This table can then be used to create a new submission on the FOR-species20K Codabench benchmarking platform and obtain the evaluation metrics for the test data.
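
The workflow described above could be sketched in Python roughly as follows. This is a minimal illustration, not the authors' pipeline: the helper names (load_labels, stratified_split, write_submission), the 20% validation fraction, and the per-species stratification are assumptions made for the example. The species column name passed to load_labels is also an assumption and should be checked against the header of tree_metadata_dev.csv. Only the treeID and predicted_species column names of the submission table come from the description above.

```python
import csv
import random
from collections import defaultdict


def load_labels(metadata_path, id_col="treeID", species_col="species"):
    """Read {treeID: species} from the dev metadata file.

    species_col is an assumed column name; check the actual CSV header.
    """
    with open(metadata_path, newline="", encoding="utf-8") as f:
        return {row[id_col]: row[species_col] for row in csv.DictReader(f)}


def stratified_split(labels, val_fraction=0.2, seed=0):
    """Split {treeID: species} into train and validation treeID lists.

    The split is done per species so that both parts keep every class.
    """
    by_species = defaultdict(list)
    for tree_id, species in labels.items():
        by_species[species].append(tree_id)

    rng = random.Random(seed)
    train, val = [], []
    for species in sorted(by_species):
        ids = sorted(by_species[species])
        rng.shuffle(ids)
        n_val = int(round(len(ids) * val_fraction))
        val.extend(ids[:n_val])
        train.extend(ids[n_val:])
    return train, val


def write_submission(predictions, path):
    """Write one row per test tree with the columns treeID and predicted_species."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["treeID", "predicted_species"])
        for tree_id, species in predictions.items():
            writer.writerow([tree_id, species])
```

A typical use would be to call load_labels on tree_metadata_dev.csv, pass the result to stratified_split to obtain training and validation treeIDs, and, once a model has produced a species for every test tree, pass a {treeID: species} dictionary to write_submission.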

Cite

Any scientific publication using the data should cite the following paper:

Puliti, S., Lines, E., Müllerová, J., Frey, J., Schindler, Z., Straker, A., Allen, M.J., Winiwarter, L., Rehush, N., Hristova, H., Murray, B., Calders, K., Terryn, L., Coops, N., Höfle, B., Krůček, M., Krokm, G., Král, K., Luck, L., Levick, S.R., Missarov, A., Mokroš, M., Owen, H., Stereńczak, K., Pitkänen, T.P., Puletti, N., Saarinen, N., Hopkinson, C., Torresan, C., Tomelleri, E., Weiser, H., Junttila, S., and Astrup, R. (2024) Benchmarking tree species classification from proximally-sensed laser scanning data: introducing the FOR-species20K dataset. ArXiv; available here

Files

dev.zip

Files (27.2 GB)

Size      Checksum
25.4 GB   md5:ecb6196d9d630b095e6f6249b46efdd7
1.8 GB    md5:087eaed01d45bf83643b4d7edcef33f8
1.2 MB    md5:603c91ff58c8eab486717ffc82a1b21f
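
Given the size of the archives, it can be worth checking a download against the md5 checksums listed above. The following is a minimal sketch using only the Python standard library. The function name md5_of and the chunk size are choices made for this example, and the file path is whatever local path the download was saved to.

```python
import hashlib


def md5_of(path, chunk_size=1024 * 1024):
    """Return the hexadecimal md5 digest of a file, read in chunks
    so that multi-gigabyte archives do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The returned string can be compared with the part after "md5:" in the checksum listed for the corresponding file; a mismatch suggests an incomplete or corrupted download.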

Additional details

Dates

Available
2024

Software

Repository URL
https://github.com/stefp/FOR-species
Programming language
Python
Development Status
Active