Published April 21, 2026 | Version v1
Dataset Open

Dataset B of Paper "Hidden in Plain Signs: Realistic Sticker Attacks on Production Traffic Sign Recognition Systems"

Description

⚠️ Academic Use Only: Non-Commercial Research Dataset

This dataset is a derivative work assembled exclusively for academic research purposes. It inherits Non-Commercial (NC) restrictions from its source datasets. See the License section for full details.

Overview

This dataset was compiled for academic research on traffic sign detection and recognition. It aggregates and preprocesses images from five publicly available benchmark datasets, applying cropping and resizing transformations to meet the input requirements of the deep learning models evaluated in our work.

Key facts:

  • Purpose: Academic research only (non-commercial)
  • Task: Traffic sign detection / recognition
  • Images: 30,226
  • Classes: 17
  • Splits: Train / Validation / Test
  • Image format: JPEG

Dataset Composition

This dataset is a derivative work combining images from the following source datasets. Each subset retains the license of its original source.

| # | Source Dataset | Images Used | Classes Used |
|---|---|---|---|
| 1 | Mapillary Traffic Sign Dataset (MTSD) | 13,662 | Limits 10-120 km/h, Stop, No Vehicles, No Stopping, No Parking, No Entry |
| 2 | DFG Traffic Sign Dataset | 2,360 | Limits 10, 30-70 km/h, Stop, No Vehicles, No Stopping, No Parking, No Entry |
| 3 | TT100k | 6,605 | Limits 10-120 km/h, No Vehicles, No Stopping, No Parking, No Entry |
| 4 | GTSDB (German Traffic Sign Detection Benchmark) | 357 | Limits 20, 30, 50-80, 100, 120 km/h, Stop, No Entry |
| 5 | ItalianSigns | 361 | all (Limits 20-90 km/h) |

Due to the preprocessing phase (zooming/cropping), some images were split into multiple patches. The total number of images in this dataset is therefore greater than the sum of the "Images Used" column above.

Dataset Structure

dataset/
├── LICENSE.txt
├── scripts/
│   ├── adapt_sources/        # Pipelines to extract data from original sources, one script per dataset
│   │   ├── Mapillary.py
│   │   ├── DFG.py
│   │   ├── TT100k.py
│   │   ├── GTSDB.py
│   │   └── ItalianSigns.py
│   └── zoom_and_crop.py      # Script to be applied after download of the sources, to zoom images with very small objects
├── images/
│   ├── train/
│   ├── test/
│   └── val/
└── labels/                   # One .txt file per image, with the annotations in YOLO format.
    ├── train/
    ├── test/
    └── val/

Label format: YOLO .txt
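
As a rough illustration of the label format, the sketch below parses the contents of one label file. It assumes the common YOLO detection convention of one object per line, `class_id x_center y_center width height`, with all coordinates normalized to [0, 1] relative to the image size. The names `YoloBox` and `parse_yolo_labels` are hypothetical helpers, not part of the released scripts, and the mapping from class ids to the 17 sign classes is not covered here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class YoloBox:
    """One annotation line: class id plus a normalized, center-based box."""
    class_id: int
    x_center: float
    y_center: float
    width: float
    height: float

    def to_pixels(self, img_w: int, img_h: int) -> Tuple[float, float, float, float]:
        """Convert to (left, top, right, bottom) in pixel coordinates."""
        left = (self.x_center - self.width / 2) * img_w
        top = (self.y_center - self.height / 2) * img_h
        right = (self.x_center + self.width / 2) * img_w
        bottom = (self.y_center + self.height / 2) * img_h
        return left, top, right, bottom


def parse_yolo_labels(text: str) -> List[YoloBox]:
    """Parse the text of one label file (assumed layout: 5 fields per line)."""
    boxes = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        parts = line.split()
        if not parts:  # skip empty lines
            continue
        if len(parts) != 5:
            raise ValueError(f"line {line_no}: expected 5 fields, got {len(parts)}")
        boxes.append(YoloBox(int(parts[0]), *map(float, parts[1:])))
    return boxes
```

In practice the text would be read from a file under `labels/`, and the pixel conversion would use the width and height of the matching image under `images/`.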

Preprocessing

After the official sources were downloaded, the images were processed with the scripts in the /scripts folder of this repository, both to extract the images relevant to our purposes and to meet model input requirements.

In particular:

  1. Image extraction: The scripts in /scripts/adapt_sources were run, one per source dataset. Each script selects only the relevant images and converts their annotations to our label set in YOLO format. The images are renamed using the name of the source dataset followed by an incremental numeric suffix (e.g., `DFG (0).jpg`, `DFG (1).jpg`).
  2. Cropping and zooming: The script /scripts/zoom_and_crop.py was run on the images produced by step 1. It zooms in on images that contain very small traffic signs, to better isolate the sign region. When an image is split into multiple patches during this process, an incremental numeric suffix is appended to the names of the resulting sub-images (e.g., `DFG (0)_0.jpg`, `DFG (0)_1.jpg`).
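
The naming scheme described in the two steps above can be decoded with a short sketch like the following. It assumes every file name follows the pattern shown in the examples (`<source> (<index>).jpg`, optionally with a `_<patch>` suffix); the helper names `SampleName` and `parse_sample_name` are hypothetical and are not taken from the released scripts.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern, inferred from the examples `DFG (0).jpg` and `DFG (0)_1.jpg`.
_NAME_RE = re.compile(
    r"^(?P<source>.+) \((?P<index>\d+)\)(?:_(?P<patch>\d+))?\.(?P<ext>\w+)$"
)


class SampleName(NamedTuple):
    source: str           # source dataset prefix, e.g. "DFG"
    index: int            # incremental index assigned in step 1
    patch: Optional[int]  # patch index from step 2, None if the image was not split


def parse_sample_name(filename: str) -> SampleName:
    """Split a dataset file name into source, image index, and patch index."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    patch = match.group("patch")
    return SampleName(
        source=match.group("source"),
        index=int(match.group("index")),
        patch=int(patch) if patch is not None else None,
    )
```

This makes it possible, for example, to group all patches that originate from the same source image by the pair `(source, index)`.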

Finally, the dataset is split into training (70%), testing (15%) and validation (15%) images.
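
A 70/15/15 split of this kind could be produced along the following lines. This is only a minimal sketch: the actual procedure (random seed, any stratification by class or source, and whether patches of the same image are kept in the same split) is not described here, so the plain seeded shuffle and the function name `split_dataset` are assumptions.

```python
import random
from typing import Dict, List, Sequence


def split_dataset(
    names: Sequence[str],
    train: float = 0.70,
    test: float = 0.15,
    seed: int = 0,  # assumed value; the seed actually used is not stated
) -> Dict[str, List[str]]:
    """Shuffle file names reproducibly and cut them into train/test/val."""
    shuffled = sorted(names)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_test = round(len(shuffled) * test)
    return {
        "train": shuffled[:n_train],
        "test": shuffled[n_train:n_train + n_test],
        "val": shuffled[n_train + n_test:],  # remainder, about 15%
    }
```

Each image and its label file share the same base name, so a split computed on image names applies to both the `images/` and `labels/` folders.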

License

This dataset is a derivative work and is released under CC BY-NC-SA 4.0 (Creative Commons Attribution – NonCommercial – ShareAlike 4.0 International), which is the most restrictive license among those of the contributing source datasets that require ShareAlike terms.

In summary, you are free to:

  • Share — copy and redistribute this dataset in any medium or format
  • Adapt — preprocess, crop, or otherwise transform the material

Under the following terms:

  • Attribution (BY) — You must give appropriate credit to all original source datasets (see Citations).
  • NonCommercial (NC) — You may not use this dataset for commercial purposes.
  • ShareAlike (SA) — If you build upon this dataset, you must distribute your derivative work under the same CC BY-NC-SA 4.0 license.

Per-source license summary

| Source | License | License Link |
|---|---|---|
| Mapillary TSD | CC BY-NC-SA 4.0 | https://creativecommons.org/licenses/by-nc-sa/4.0/ |
| DFG Traffic Sign Dataset | CC BY-NC-SA 4.0 | https://creativecommons.org/licenses/by-nc-sa/4.0/ |
| TT100k | CC BY-NC 4.0 | https://creativecommons.org/licenses/by-nc/4.0/ |
| GTSDB | No explicit license stated; research use only per original authors | https://benchmark.ini.rub.de/gtsdb_dataset.html |
| ItalianSigns | GNU LGPLv3 | https://www.gnu.org/licenses/lgpl-3.0.html |

Disclaimer: The authors of this compiled dataset are not lawyers and this summary does not constitute legal advice. Users are responsible for verifying compliance with each source dataset's license before use.

Note on GTSDB licensing: The GTSDB was published by the Institut für Neuroinformatik (INI) at Ruhr-Universität Bochum. Please refer to the original dataset page for the authoritative license terms before reusing this subset.

Citations

If you use this dataset in your research, please cite all original source datasets listed below.

Source dataset citations

1. Mapillary Traffic Sign Dataset (MTSD)

@inproceedings{ertler2020mapillary,
  title={The mapillary traffic sign dataset for detection and classification on a global scale},
  author={Ertler, Christian and Mislej, Jerneja and Ollmann, Tobias and Porzi, Lorenzo and Neuhold, Gerhard and Kuang, Yubin},
  booktitle={European conference on computer vision},
  pages={68--84},
  year={2020},
  organization={Springer}
}

2. DFG Traffic Sign Dataset

@article{Tabernik2019ITS,
    author = {Tabernik, Domen and Sko{\v{c}}aj, Danijel},
    journal = {IEEE Transactions on Intelligent Transportation Systems},
    title = {{Deep Learning for Large-Scale Traffic-Sign Detection and Recognition}},
    year = {2019},
    doi={10.1109/TITS.2019.2913588}, 
    ISSN={1524-9050}
}

3. TT100k

@InProceedings{Zhe_2016_CVPR,
  title     = {Traffic-Sign Detection and Classification in the Wild},
  author    = {Zhu, Zhe and Liang, Dun and Zhang, Songhai and Huang, Xiaolei
               and Li, Baoli and Hu, Shimin},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2016}
}

4. GTSDB (German Traffic Sign Detection Benchmark)

@inproceedings{houben2013gtsdb,
  title     = {Detection of Traffic Signs in Real-World Images: The {German Traffic
               Sign Detection Benchmark}},
  author    = {Houben, Sebastian and Stallkamp, Johannes and Salmen, Jan and
               Schlipsing, Marc and Igel, Christian},
  booktitle = {International Joint Conference on Neural Networks (IJCNN)},
  year      = {2013},
  doi       = {10.1109/IJCNN.2013.6706807}
}

5. ItalianSigns

@misc{ItalianSigns,
  author={Daniel Rossi and Riccardo Salami},
  title={ItalianSigns},
  year={2022},
  howpublished={\url{https://www.kaggle.com/datasets/officialprojecto/italiansigns}}
}

Acknowledgements

We thank the authors and institutions that made their datasets publicly available for research purposes:

  • The Mapillary team for the MTSD dataset.
  • Domen Tabernik and Danijel Skočaj for the DFG Traffic Sign Dataset.
  • Zhe Zhu et al. for the TT100k dataset.
  • Sebastian Houben et al. for the GTSDB.
  • Daniel Rossi and Riccardo Salami for the ItalianSigns traffic sign dataset.

Contact

This dataset was submitted anonymously for peer review. After the review process, author information and the associated paper reference will be added here.

For questions regarding licensing and reuse, please open an issue on this repository or contact the corresponding author after de-anonymization.

Last updated: 2026
Dataset version: 1.0

Files

images.zip

Files (6.3 GB)

| Checksum | Size |
|---|---|
| md5:787a33c50bf10e99648bede5a178fed1 | 6.3 GB |
| md5:7ec550e28b97192e79dccd0ad3504d60 | 7.6 MB |
| md5:ec38f392e76c08f1b8410d129b6e554e | 384 Bytes |
| md5:444363a00704e8d95c1444ca2b60c96b | 8.0 kB |
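
The checksums above are given in the form `md5:<hex digest>`. A downloaded file could be checked against its entry roughly as follows, using Python's standard `hashlib` module. The helper names `file_md5` and `matches_checksum` are hypothetical, and the local path of the download is up to the user.

```python
import hashlib


def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, listed: str) -> bool:
    """Compare a file against a listed entry such as 'md5:787a33c5...'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return file_md5(path) == expected
```

Reading in chunks keeps memory use low, which matters for the multi-gigabyte archive.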

Additional details

References

  • Christian Ertler, Jerneja Mislej, Tobias Ollmann, Lorenzo Porzi, Gerhard Neuhold, and Yubin Kuang. The mapillary traffic sign dataset for detection and classification on a global scale. In European conference on computer vision, pages 68–84. Springer, 2020.
  • Domen Tabernik and Danijel Skočaj. Deep Learning for Large-Scale Traffic-Sign Detection and Recognition. IEEE Transactions on Intelligent Transportation Systems, 2019.
  • Zhe Zhu, Dun Liang, Songhai Zhang, Xiaolei Huang, Baoli Li, and Shimin Hu. Traffic-sign detection and classification in the wild. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
  • Sebastian Houben, Johannes Stallkamp, Jan Salmen, Marc Schlipsing, and Christian Igel. Detection of Traffic Signs in Real-World Images: The German Traffic Sign Detection Benchmark. International Joint Conference on Neural Networks (IJCNN 2013), pp. 715-722, IEEE Press.
  • Daniel Rossi and Riccardo Salami. ItalianSigns. https://www.kaggle.com/datasets/officialprojecto/italiansigns, 2022.