Published June 4, 2024 | Version V1.0
Dataset Open

StreetSurfaceVis: a dataset of street-level imagery with annotations of road surface type and quality

  • 1. HTW Berlin - University of Applied Sciences
  • 2. Hochschule für Technik und Wirtschaft Berlin

Contributors

Researcher:

  • 1. HTW Berlin - University of Applied Sciences

Description

StreetSurfaceVis

StreetSurfaceVis is an image dataset containing 9,122 street-level images from Germany with labels for road surface type and quality. The CSV file streetSurfaceVis_v1_0.csv contains all image metadata, and four folders contain the image files. Every image is provided in four sizes defined by image width: 256 px, 1024 px, 2048 px, and the original size.
Each folder is named after the image size it contains, and each image file is named after its mapillary_image_id.
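The layout described above can be turned into file paths with a few lines of Python. This is a minimal sketch, not part of the dataset tooling: the root folder name, the folder name "s_1024", and the ".jpg" extension are assumptions, so check them against the extracted archive. The sample CSV content is made up for illustration.

```python
import csv
import io
from pathlib import Path


def image_path(root, size_folder, mapillary_image_id, extension=".jpg"):
    """Build the path of one image file.

    size_folder and extension are assumptions; the description only says
    that folders are named after the image size and files after the
    mapillary_image_id.
    """
    return Path(root) / size_folder / f"{mapillary_image_id}{extension}"


def read_metadata(csv_file):
    """Read all records from an open CSV file into a list of dicts."""
    return list(csv.DictReader(csv_file))


# Made-up sample standing in for streetSurfaceVis_v1_0.csv.
sample = io.StringIO("mapillary_image_id,surface_type\n123456789,asphalt\n")
rows = read_metadata(sample)
paths = [
    image_path("StreetSurfaceVis", "s_1024", row["mapillary_image_id"])
    for row in rows
]
```

For the real file, replace the in-memory sample with `open("streetSurfaceVis_v1_0.csv", newline="")`.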

You can find the corresponding publication here:  StreetSurfaceVis: a dataset of crowdsourced street-level imagery with semi-automated annotations of road surface type and quality

 

Image metadata

Each CSV record contains information about one street-level image with the following attributes:

  • mapillary_image_id: ID provided by Mapillary (see information below on Mapillary)
  • user_id: Mapillary user ID of contributor
  • user_name: Mapillary user name of contributor
  • captured_at: timestamp, capture time of image
  • longitude, latitude: location the image was taken at
  • train: Suggested train-test split. `True` marks training data and `False` marks test data. The test data comes from 5 cities that are excluded from the training data.
  • surface_type: Surface type of the road in the focal area (the center of the lower image half) of the image. Possible values: asphalt, concrete, paving_stones, sett, unpaved
  • surface_quality: Surface quality of the road in the focal area of the image. Possible values: (1) excellent, (2) good, (3) intermediate, (4) bad, (5) very bad (see the attached Labeling Guide document for details)
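The attributes above can be parsed into a typed record, which makes the ordinal nature of the quality scale explicit. This is a sketch under assumptions: the exact spelling of the quality values in the CSV (for example "very_bad" versus "very bad") and the text form of the `train` flag are not stated here, so the code normalizes both. `ImageRecord` and `parse_record` are hypothetical names.

```python
from dataclasses import dataclass

SURFACE_TYPES = ("asphalt", "concrete", "paving_stones", "sett", "unpaved")

# Ordinal codes follow the numbering in the attribute list (1 = excellent).
# The underscore spelling of the keys is an assumption.
QUALITY_TO_ORDINAL = {
    "excellent": 1,
    "good": 2,
    "intermediate": 3,
    "bad": 4,
    "very_bad": 5,
}


@dataclass(frozen=True)
class ImageRecord:
    mapillary_image_id: str
    surface_type: str
    quality_ordinal: int
    train: bool


def parse_record(row):
    """Convert one CSV row (a dict of strings) into an ImageRecord."""
    surface_type = row["surface_type"].strip()
    if surface_type not in SURFACE_TYPES:
        raise ValueError(f"unexpected surface_type: {surface_type!r}")
    # Accept both "very bad" and "very_bad".
    quality = row["surface_quality"].strip().lower().replace(" ", "_")
    return ImageRecord(
        mapillary_image_id=row["mapillary_image_id"],
        surface_type=surface_type,
        quality_ordinal=QUALITY_TO_ORDINAL[quality],
        train=str(row["train"]).strip().lower() == "true",
    )


# Made-up row for illustration.
record = parse_record(
    {
        "mapillary_image_id": "123456789",
        "surface_type": "unpaved",
        "surface_quality": "very bad",
        "train": "False",
    }
)
```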

 

Image source

Images are obtained from Mapillary, a crowdsourcing platform for street-level imagery. More metadata about each image can be obtained via the Mapillary API. User-generated images are shared by Mapillary under the CC-BY-SA License.

For each image, the dataset contains the mapillary_image_id and user_name.
You can access user information on the Mapillary website at https://www.mapillary.com/app/user/<USER_NAME>
and image information at https://www.mapillary.com/app/?focus=photo&pKey=<MAPILLARY_IMAGE_ID>
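The two URL patterns above can be filled in programmatically. The following sketch only formats the strings shown in the text; the function names are hypothetical, and the user name is percent-encoded as a precaution in case it contains spaces or special characters.

```python
from urllib.parse import quote, urlencode


def mapillary_user_url(user_name):
    """Return the Mapillary profile URL for a contributor."""
    return "https://www.mapillary.com/app/user/" + quote(user_name, safe="")


def mapillary_image_url(mapillary_image_id):
    """Return the Mapillary viewer URL for one image."""
    query = urlencode({"focus": "photo", "pKey": str(mapillary_image_id)})
    return "https://www.mapillary.com/app/?" + query


# Made-up values for illustration.
user_link = mapillary_user_url("some user")
image_link = mapillary_image_url(123456789)
```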

If you use the provided images, please adhere to the terms of use of Mapillary.

 

Instances per class

Total number of images: 9,122

               excellent  good  intermediate  bad  very bad
asphalt              971  1697           821  246         -
concrete             314   350           250   58         -
paving stones        385  1063           519   70         -
sett                   -   129           694  540         -
unpaved                -     -           326  387       303
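A cross-tabulation like the one above can be recomputed from the metadata records. The sketch below counts (surface_type, surface_quality) pairs with the standard library; the sample rows are made up, and `class_counts` is a hypothetical helper name.

```python
from collections import Counter


def class_counts(rows):
    """Count records per (surface_type, surface_quality) combination."""
    return Counter((row["surface_type"], row["surface_quality"]) for row in rows)


# Made-up rows standing in for records read from the CSV file.
sample_rows = [
    {"surface_type": "asphalt", "surface_quality": "good"},
    {"surface_type": "asphalt", "surface_quality": "good"},
    {"surface_type": "sett", "surface_quality": "bad"},
]
counts = class_counts(sample_rows)
```

Combinations that never occur, shown as "-" in the table, are simply absent from the counter and return 0 when looked up.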

 

For modeling, we recommend a train-test split in which the test data covers geospatially distinct areas, so that the evaluation measures how well the model generalizes to unseen regions. For testing, we propose five cities of varying population size from different regions of Germany; images are tagged accordingly in the train attribute.

Number of test images (train-test split): 776
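The proposed split can be applied by partitioning the records on the `train` attribute. This is a minimal sketch assuming the flag is stored as the text "True" or "False" in the CSV; the sample rows and the function name are made up.

```python
def split_by_train_flag(rows):
    """Partition records into (train, test) lists using the train attribute."""
    train_rows, test_rows = [], []
    for row in rows:
        # Assumption: the flag is serialized as "True"/"False".
        is_train = str(row["train"]).strip().lower() == "true"
        (train_rows if is_train else test_rows).append(row)
    return train_rows, test_rows


# Made-up rows for illustration.
sample_rows = [
    {"mapillary_image_id": "1", "train": "True"},
    {"mapillary_image_id": "2", "train": "False"},
    {"mapillary_image_id": "3", "train": "True"},
]
train_rows, test_rows = split_by_train_flag(sample_rows)
```

On the full metadata file, the test list should have the length stated above; a mismatch would suggest the flag is encoded differently than assumed.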

Inter-rater reliability

Three annotators labeled the dataset, such that each image was annotated by one person. Annotators were encouraged to consult each other for a second opinion when uncertain.
1,800 images were annotated by all three annotators, resulting in a Krippendorff's alpha of 0.96 for surface type and 0.74 for surface quality.
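To make the reliability measure concrete, the sketch below computes Krippendorff's alpha with the nominal distance metric, which fits unordered categories such as surface type. Which implementation and metric were used for the reported values is not stated here, so this illustrates the measure rather than reproducing the authors' computation. For the ordered quality scale an ordinal metric would be a natural choice, which is not shown. The example ratings are made up.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels.

    units: iterable of sequences, one per image, holding the labels
    assigned by the annotators who rated that image.
    """
    # Coincidence matrix: each ordered pair of labels from two different
    # raters of the same unit contributes 1 / (m - 1).
    coincidences = Counter()
    for ratings in units:
        m = len(ratings)
        if m < 2:
            continue  # units with a single rating carry no agreement information
        for a, b in permutations(ratings, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        return 1.0  # only one category was ever used
    return 1.0 - observed / expected


# Made-up ratings: three annotators, three images, one disagreement.
example_units = [
    ("asphalt", "asphalt", "asphalt"),
    ("sett", "sett", "sett"),
    ("asphalt", "asphalt", "sett"),
]
alpha = krippendorff_alpha_nominal(example_units)
```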

Recommended image preprocessing

Because the label describes the focal road area in the bottom center of the street-level image, we recommend cropping each image to the lower half of its height and the middle half of its width before using it for classification tasks.

Example Python code for the recommended image preprocessing:

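from PIL import Image

image_path = "path/to/image.jpg"  # placeholder: set this to an image file from the dataset
img = Image.open(image_path)
width, height = img.size
# Keep the lower half of the height and the middle half of the width.
img_cropped = img.crop((int(0.25 * width), int(0.5 * height), int(0.75 * width), height))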


License

CC-BY-SA

 

Citation

If you use this dataset, please cite as: 

 

Kapp, A., Hoffmann, E., Weigmann, E. et al. StreetSurfaceVis: a dataset of crowdsourced street-level imagery annotated by road surface type and quality. Sci Data 12, 92 (2025). https://doi.org/10.1038/s41597-024-04295-9

 

@article{kapp_streetsurfacevis_2025,
    title = {{StreetSurfaceVis}: a dataset of crowdsourced street-level imagery annotated by road surface type and quality},
    volume = {12},
    issn = {2052-4463},
    url = {https://doi.org/10.1038/s41597-024-04295-9},
    doi = {10.1038/s41597-024-04295-9},
    pages = {92},
    number = {1},
    journaltitle = {Scientific Data},
    shortjournal = {Scientific Data},
    author = {Kapp, Alexandra and Hoffmann, Edith and Weigmann, Esther and Mihaljević, Helena},
    date = {2025-01-16},
}

 

-----------------------------------------------------------------------------------------------------------------------------------------------------------

This is part of the SurfaceAI project at the University of Applied Sciences, HTW Berlin.


- Prof. Dr. Helena Mihaljević
- Alexandra Kapp
- Edith Hoffmann
- Esther Weigmann

Contact: surface-ai@htw-berlin.de

https://surfaceai.github.io/surfaceai/

Funding: SurfaceAI is an mFund project funded by the German Federal Ministry for Digital and Transport.

 

Files

dataset_description.pdf

Files (24.4 GB)

MD5 checksum                      Size
debc3d9730151b5f6fec88d39a850b0e  621.2 kB
65eee65fce122a56df501b5aeaa86b71  118.3 kB
598da7ba7e93fd995ff30a3de2ac03b0  2.2 MB
3a32ab4f5f2992a267c4274179717c77  1.3 GB
909b5cb41de03b74f8e09a0c714b9af3  4.5 GB
d2beab4af8f3995bc8789ba27645ddc0  99.1 MB
f2d5a490702b3024f1e4c81080b9c0e1  18.5 GB
a6ac6ab1edf7d813ebe7ffb22b922423  967.2 kB

Additional details

Related works

Is published in
Journal article: 10.1038/s41597-024-04295-9 (DOI)

Funding

Federal Ministry of Transport and Digital Infrastructure
mFund 19F1165A