Published June 5, 2024 | Version First
Dataset Open

Train and Evaluation Code, Road Classification Models and Test set of the paper "Insights into the Effects of Image Overlap and Image Size on Semantic Segmentation Models Trained for Road Surface Area Extraction from Aerial Orthophotography"

Description

This repository contains the Python scripts used to train and evaluate the models, together with the test data and the resulting road segmentation models, for the paper "Insights into the Effects of Image Overlap and Image Size on Semantic Segmentation Models Trained for Road Surface Area Extraction from Aerial Orthophotography". The scripts use TensorFlow with Keras and their required dependencies.

The training and validation set is based on the binary SROADEX dataset (https://zenodo.org/records/6482346), which was re-split into tiles with the image sizes (256 x 256, 512 x 512, and 1024 x 1024 pixels) and image overlaps (0% and 12.5%) considered in this study. The data were generated with Python scripts that use open-source libraries (GDAL/OGR and MapScript) to rasterize the vector cartography representing the axes of the different road types (urban, interurban and rural). This binary road data contains information from 16 full orthoimages (28.5 km x 18.5 km) with a spatial resolution of 0.5 m/pixel, covering the insular and peninsular Spanish territory. Because it occupies approximately 492 GB on disk, the training and validation data is available only upon request from the corresponding author.

The test set was generated from a new area in Palencia (Spain) and features 18 million pixels labelled with the positive "Road" class. The test sets are provided in the repository for each image size (with no overlap), so that additional DL models can be evaluated on the same data and compared with the results achieved in this study.
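To make the relation between image size, overlap and tile placement concrete, the following minimal sketch computes tile start offsets along one image axis. It is not the authors' tiling script: the function name `tile_origins`, the definition of overlap as the fraction of the tile side shared with the next tile, and the choice to drop partial tiles at the image border are all assumptions made for illustration. The orthoimage dimensions in pixels are derived from the figures quoted above (28.5 km x 18.5 km at 0.5 m/pixel), assuming those figures describe a single orthoimage.

```python
def tile_origins(length_px, tile_px, overlap):
    """Return the start offsets of tiles along one axis.

    Assumptions (not taken from the published scripts):
    - `overlap` is the fraction of the tile side shared with the next tile;
    - partial tiles at the image border are dropped.
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    stride = int(tile_px * (1.0 - overlap))
    return list(range(0, length_px - tile_px + 1, stride))


# Orthoimage extent in pixels, derived from 28.5 km x 18.5 km at 0.5 m/pixel.
PIXEL_SIZE_M = 0.5
WIDTH_PX = round(28_500 / PIXEL_SIZE_M)   # 57000
HEIGHT_PX = round(18_500 / PIXEL_SIZE_M)  # 37000

if __name__ == "__main__":
    for tile_px in (256, 512, 1024):
        for overlap in (0.0, 0.125):
            n_cols = len(tile_origins(WIDTH_PX, tile_px, overlap))
            n_rows = len(tile_origins(HEIGHT_PX, tile_px, overlap))
            print(f"tile={tile_px}px overlap={overlap:.1%}: "
                  f"{n_cols} x {n_rows} full tiles per orthoimage")
```

Under these assumptions, a 12.5% overlap corresponds to strides of 224, 448 and 896 pixels for the three image sizes. The tile counts in the published dataset may differ if the original scripts treat borders or empty tiles differently.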

The information shared in this repository is structured as follows:
The contents are grouped by tile size (256, 512 and 1024). For each tile size, the test set and the evaluation script come first, followed by two subfolders (corresponding to "no overlap" and "12.5% overlap"). Each subfolder contains the Python scripts used to train the models in the three repetitions, together with the trained models (H5 format) in compressed form. The test dataset shared for each tile size consists of two folders.

The material is distributed under a CC-BY 4.0 license.

Files

B-segmentation.zip (29.2 GB)
md5:ad89c0857e4b9803ea1b7afbac0aee46
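Given the size of the archive, it may be worth checking the download against the md5 checksum listed above before unpacking it. The sketch below uses only the Python standard library; the helper names `md5_of_file` and `verify_download` and the local file path are hypothetical, and the file is read in chunks so that the 29.2 GB archive does not need to fit in memory.

```python
import hashlib

# Checksum as published on this record for B-segmentation.zip.
EXPECTED_MD5 = "ad89c0857e4b9803ea1b7afbac0aee46"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected md5 digest."""
    return md5_of_file(path) == expected.lower()
```

For example, `verify_download("B-segmentation.zip")` would return `True` for an intact copy saved under that (assumed) local name, and `False` for a truncated or corrupted one.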

Additional details

Funding

Ministerio de Ciencia, Innovación y Universidades
SROADEX PID2020-116448GB-I00