Input and output data (images + boulder labels, model setup, model weights and more) for the manuscript "Automatic characterization of boulders on planetary surfaces from high-resolution satellite images"
Creators
- 1. Stanford University, University of Oslo
- 2. Stanford University
- 3. Ponoma University
- 4. Arizona State University
- 5. Medvedev Consulting
- 6. Technion Israel Institute of Technology, Stanford University
- 7. University of Oslo
- 8. School of Atmospheric Sciences Sun Yat-Sen University
- 9. Volcanic Basin Energy Research, A.P. Karpinsky Russian Geological Research Institute Saint Petersburg, University of Oslo
Description
File 1: raw_data_BOULDERING.zip
Size: 8.8 GB
Summary: It contains all of the rasters (planetary images) and labeled boulders (raw data). Each image comes with three shapefiles:
- a boulder-mapping file, which contains the manually digitized outlines of the boulders.
- a ROM file (ROM stands for Region of Mapping), which depicts the image patches on which the boulder mapping was conducted.
- a global-tiles file, which shows all of the image patches within a raster.
There are multiple locations/images per planetary body.
Structure:
.
└── raw_data/
    ├── earth/
    │   └── image_name/
    │       ├── shp/
    │       │   ├── <image_name>-ROM.shp
    │       │   ├── <image_name>-boulder-mapping.shp
    │       │   └── <image_name>-global-tiles.shp
    │       └── raster/
    │           └── <image_name>.tif
    ├── mars/
    │   └── image_name/
    │       ├── shp/
    │       │   ├── <image_name>-ROM.shp
    │       │   ├── <image_name>-boulder-mapping.shp
    │       │   └── <image_name>-global-tiles.shp
    │       └── raster/
    │           └── <image_name>.tif
    └── moon/
        └── image_name/
            ├── shp/
            │   ├── <image_name>-ROM.shp
            │   ├── <image_name>-boulder-mapping.shp
            │   └── <image_name>-global-tiles.shp
            └── raster/
                └── <image_name>.tif
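The layout above is regular enough that the four files belonging to one image can be located from the planetary body and the image name alone. The following is a minimal sketch using only the Python standard library; the function name `image_paths`, the root folder `data` and the image name `M123` are hypothetical and only illustrate the naming pattern shown in the tree.

```python
from pathlib import Path

# Planetary bodies and shapefile suffixes as they appear in the tree above.
BODIES = ("earth", "mars", "moon")
SHAPEFILE_SUFFIXES = ("ROM", "boulder-mapping", "global-tiles")


def image_paths(root, body, image_name):
    """Return the expected raster and shapefile paths for one image.

    `root` is the folder into which raw_data_BOULDERING.zip was extracted
    (assumption: extraction yields a top-level raw_data/ folder).
    """
    if body not in BODIES:
        raise ValueError(f"unknown planetary body: {body!r}")
    base = Path(root) / "raw_data" / body / image_name
    paths = {"raster": base / "raster" / f"{image_name}.tif"}
    for suffix in SHAPEFILE_SUFFIXES:
        paths[suffix] = base / "shp" / f"{image_name}-{suffix}.shp"
    return paths


if __name__ == "__main__":
    # "M123" is a made-up image name used only for illustration.
    for key, path in image_paths("data", "moon", "M123").items():
        print(f"{key}: {path}")
```

The returned paths are only the expected locations; whether each file exists should still be checked (for example with `Path.exists()`) before opening it with a GIS library.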
File 2: best_model.zip
Size: 624.7 MB
Summary:
This zip file contains all of the inputs required for, and the outputs obtained from, the training of the BoulderNet Mask R-CNN model (model setup, augmentation pipeline, model weights, training log, logged metrics):
- augmentation_pipeline.json (required as input for training, to apply the augmentations). See https://github.com/astroNils and the MLtools repository for more information.
- Base-RCNN-FPN.yaml (base model setup file).
- config.yaml (complete model setup file, a merge of the base and Mars-Moon-Earth setup files).
- Mars-MoonEarth-v050...yaml (model setup file).
- log.txt (log written during training of the algorithm).
- model_0055999.pth (model weights at the second-to-last saving step).
- model_0063999.pth (model weights at the last saving step).
We advise using the model weights model_0055999.pth (to avoid slight overfitting).
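As an illustration of the recommendation above, the sketch below picks the second-to-last checkpoint from the file names and, optionally, builds a Detectron2 predictor from it. Several things here are assumptions and not confirmed by this record: that the zip extracts to a flat folder containing config.yaml and the .pth files, that config.yaml can be merged into a default Detectron2 config without extra custom keys, and that a score threshold of 0.5 is sensible. The function names are hypothetical. The MLtools repository remains the authoritative reference for how the authors load the model.

```python
import re
from pathlib import Path


def checkpoint_iteration(name):
    """Return the iteration encoded in a name like 'model_0055999.pth', else None."""
    match = re.fullmatch(r"model_(\d+)\.pth", name)
    return int(match.group(1)) if match else None


def second_to_last_checkpoint(names):
    """Return the checkpoint file name with the second-highest iteration number."""
    found = []
    for name in names:
        iteration = checkpoint_iteration(name)
        if iteration is not None:
            found.append((iteration, name))
    if len(found) < 2:
        raise ValueError("need at least two model_<iteration>.pth files")
    found.sort()
    return found[-2][1]


def build_predictor(model_dir, score_threshold=0.5):
    """Build a Detectron2 DefaultPredictor (requires detectron2 to be installed).

    Assumption: config.yaml merges cleanly into Detectron2's default config.
    """
    from detectron2.config import get_cfg
    from detectron2.engine import DefaultPredictor

    model_dir = Path(model_dir)
    cfg = get_cfg()
    cfg.merge_from_file(str(model_dir / "config.yaml"))
    weights = second_to_last_checkpoint(p.name for p in model_dir.glob("model_*.pth"))
    cfg.MODEL.WEIGHTS = str(model_dir / weights)
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = score_threshold  # illustrative value
    return DefaultPredictor(cfg)


if __name__ == "__main__":
    names = ["log.txt", "model_0063999.pth", "model_0055999.pth"]
    print(second_to_last_checkpoint(names))
```

The checkpoint selection is kept separate from the Detectron2 call so that it can be checked without a GPU or a Detectron2 installation.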
File 3: Apr2023-Mars-Moon-Earth-mask-5px.zip (pre-processed input images)
Size: 252.8 MB
Summary:
This zip file contains the input data (images and boulder outlines) for the train, validation and test datasets. See https://github.com/astroNils and the MLtools repository for more information on how to use the different files.
- The json folder contains two json files, either of which can be given as input (as a custom dataset) to the Detectron2 platform. The only difference between the two files is how the bounding boxes around the masks have been generated. We advise using "Apr2023-Mars-Moon-Earth-mask-5px.json".
- The pkl folder contains a pickle file with some information about the 950 image patches in our boulder dataset.
- The pre-processing folder contains all of the training, validation and test image patches and the corresponding shapefiles.
- The shapefile folder is empty and can be ignored (it should not be there).
Structure:
.
└── preprocessed_inputs/
    ├── json
    ├── pkl
    ├── preprocessing/
    │   ├── train/
    │   │   ├── images
    │   │   └── labels
    │   ├── validation/
    │   │   ├── images
    │   │   └── labels
    │   └── test/
    │       ├── images
    │       └── labels
    └── shp
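A quick sanity check after extraction is to count the image patches in each split. The sketch below uses only the Python standard library and follows the folder names in the tree above; the function name `count_image_patches` is hypothetical, and it is an assumption that every file in an images folder is one image patch.

```python
from pathlib import Path

# Split folder names as shown in the tree above.
SPLITS = ("train", "validation", "test")


def count_image_patches(root):
    """Count the files in preprocessing/<split>/images for each split.

    `root` is the folder into which the zip file was extracted. A missing
    split folder is reported as 0 rather than raising an error.
    """
    counts = {}
    for split in SPLITS:
        images = Path(root) / "preprocessed_inputs" / "preprocessing" / split / "images"
        if images.is_dir():
            counts[split] = sum(1 for p in images.iterdir() if p.is_file())
        else:
            counts[split] = 0
    return counts


if __name__ == "__main__":
    counts = count_image_patches(".")
    for split, n in counts.items():
        print(f"{split}: {n} image patches")
    print(f"total: {sum(counts.values())}")
```

The description mentions 950 image patches in the dataset; comparing the printed total against that figure is a reasonable check, although this record does not state explicitly that the three splits add up to exactly 950.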