A multi-scale labeled dataset for boulder segmentation and navigation on small bodies
Description
The capability to detect boulders on the surface of small bodies is beneficial for vision-based applications such as hazard detection during critical operations, safety quantification, autonomous planning of scientific operations, and autonomous navigation. This task, however, is challenging due to the wide assortment of irregular shapes, the characteristics of the boulder population, and the rapid variability in illumination conditions. Moreover, the lack of publicly available labeled datasets hampers research on data-driven algorithms. This dataset has been designed and made publicly available to tackle these challenges. Its purpose is twofold. First, building on the lessons learned from previous datasets, to provide a multi-purpose, high-fidelity dataset with boulders scattered across the surface of a small body. Second, to exploit domain randomization, artificial noise addition, scaling, and post-processing, enabling the design of data-driven pipelines.
The methodology used to generate the dataset is illustrated in the work "A multi-scale labeled dataset for boulder segmentation and navigation on small bodies" by Mattia Pugliatti and Michele Maestrini, presented at the 74th International Astronautical Congress (IAC), 2023, Baku, Azerbaijan.
The dataset contains the image-label pairs of 47502 samples, organized with the following structure:
Dataset_PugliattiMaestrini_2023IAC
--img
--labels
--masks
The "img" folder contains the input 512x512 grayscale images. The "labels" folder includes the .txt segmentation labels of the 15 most prominent boulders in each image, detected with the methodology illustrated in the IAC paper. The "masks" folder contains the segmentation masks for all image layers, with values encoded between 0 and 17 as uint8. The samples are named XXXXXX_YYY, where XXXXXX is the image's original ID during rendering and YYY identifies the sub-split of the original image obtained at rendering:
001 - Top-Left crop
002 - Top-Right crop
003 - Bottom-Left crop
004 - Bottom-Right crop
005 - Whole, resized
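The naming convention above can be decoded programmatically, for example to group the crops that come from the same rendered image. The following is a minimal sketch and not part of the dataset's own tooling: the function name `parse_sample_name` is hypothetical, and it assumes that XXXXXX is a six-digit, zero-padded ID and that the name is passed without its file extension.

```python
import re
from typing import NamedTuple

# Sub-split codes exactly as listed in the dataset description.
SUB_SPLITS = {
    "001": "Top-Left crop",
    "002": "Top-Right crop",
    "003": "Bottom-Left crop",
    "004": "Bottom-Right crop",
    "005": "Whole, resized",
}

# Assumption: the render ID is six decimal digits, the sub-split is three.
_NAME_RE = re.compile(r"^(\d{6})_(\d{3})$")


class SampleName(NamedTuple):
    render_id: str    # original image ID during rendering (XXXXXX)
    split_code: str   # sub-split code (YYY)
    split_label: str  # human-readable meaning of the sub-split


def parse_sample_name(stem: str) -> SampleName:
    """Split a sample name such as '000123_004' into its two parts."""
    match = _NAME_RE.match(stem)
    if match is None:
        raise ValueError(f"not a XXXXXX_YYY sample name: {stem!r}")
    render_id, code = match.groups()
    if code not in SUB_SPLITS:
        raise ValueError(f"unknown sub-split code: {code!r}")
    return SampleName(render_id, code, SUB_SPLITS[code])
```

Since the same name is shared across the "img", "labels", and "masks" folders, the parsed `render_id` can serve as a grouping key, for instance to keep all sub-splits of one rendered image in the same train or validation partition.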
The file "10000_ub_2023-01-18 00.09.43.txt" contains all the values of the rendering inputs detailed in the IAC paper.
Files
Dataset_PugliattiMaestrini_2023IAC.zip (7.3 GB)
md5:5da7960e8ce1dde35b42241d7cedeb83
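Given the size of the archive, it may be worth checking the download against the MD5 checksum listed for it before extracting. The sketch below uses only the Python standard library; the helper name `md5_of_file` and the local path in the usage note are assumptions chosen for illustration.

```python
import hashlib

# Checksum as published alongside the archive.
EXPECTED_MD5 = "5da7960e8ce1dde35b42241d7cedeb83"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Assuming the archive was saved in the current directory under its original name, `md5_of_file("Dataset_PugliattiMaestrini_2023IAC.zip") == EXPECTED_MD5` should evaluate to `True` for an intact download.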