Published August 19, 2024 | Version v1
Dataset Open

Synthetic Ukrainian license plate dataset for the paper "License plate images generation with diffusion models" (presented at PAIS 2024)

  • 1. National University of Kyiv Mohyla Academy
  • 2. Cyclope.ai
  • 3. Université Paris-Est Créteil Val de Marne

Description

To address data scarcity in Automatic License Plate Recognition (LPR) and the lack of available Ukrainian license plate datasets, we are releasing a dataset of 10,000 synthetic Ukrainian vehicle license plate images. The dataset is intended as a resource for training and validating LPR models. Because the images vary in lighting and viewing angle, the dataset supports the study of model performance under conditions resembling real-world scenarios, aiding the development of more accurate and resilient LPR systems.

Dataset Composition. The dataset contains 10,000 images, each with a resolution of 193 × 72 pixels. The images represent two main types of Ukrainian license plates: those for regular vehicles and those for electric motor-powered vehicles. The dataset varies lighting conditions, viewing angles, and regional codes, and covers the standard license plate formats used in Ukraine between 2004 and 2021. The data is split into training, validation, and test subsets of 8,000, 1,000, and 1,000 images, respectively.
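As a quick sanity check on the split described above, the following minimal sketch computes the share of each subset from the stated counts. The subset names used as dictionary keys are illustrative labels, not necessarily the directory names inside the archive.

```python
# Subset sizes as stated in the dataset description.
# The key names are illustrative; the archive may name its folders differently.
SPLITS = {"train": 8_000, "validation": 1_000, "test": 1_000}

total = sum(SPLITS.values())
fractions = {name: count / total for name, count in SPLITS.items()}

for name, count in SPLITS.items():
    print(f"{name:<10} {count:>6} images ({fractions[name]:.0%})")
```

The counts correspond to an 80/10/10 split of the 10,000 images.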

Note. As the data samples are synthetically generated, there may be slight inaccuracies in the representation of the intended distance between the letters and the exact color of the license plates. While these aspects have been approximated to closely resemble real-world conditions, they might not perfectly match the specifications of actual license plates.

Data Annotation. Each image in the dataset is annotated in the YOLO format, with bounding box coordinates for each character on the license plate. The filenames correspond to the license plate text and follow the standard license plate format AB0000CD, where:

  • AB represents the regional code;
  • 0000 represents the numerical sequence;
  • CD represents the suffix, corresponding to specific Ukrainian Cyrillic letters with exact Latin visual equivalents.
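The annotation scheme above could be read along the following lines. This is a minimal sketch under stated assumptions, not the authors' tooling: it assumes the filename stem is exactly the plate text in Latin capitals, and that each label line uses the common YOLO text layout `class_id x_center y_center width height` with coordinates normalized to the image size. The helper names (`parse_plate_filename`, `parse_yolo_line`) are hypothetical, and the mapping from class ids to characters is not specified in this description, so it is left as an integer.

```python
import re
from pathlib import Path
from typing import NamedTuple, Optional

IMG_W, IMG_H = 193, 72  # image resolution stated in the dataset description

# Assumed pattern for AB0000CD: two capitals, four digits, two capitals.
PLATE_RE = re.compile(r"([A-Z]{2})(\d{4})([A-Z]{2})")


class Plate(NamedTuple):
    region: str  # AB: regional code
    number: str  # 0000: numerical sequence
    suffix: str  # CD: suffix letters


class Box(NamedTuple):
    class_id: int  # character class; id-to-character mapping is not given here
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def parse_plate_filename(filename: str) -> Optional[Plate]:
    """Split a filename such as 'AA1234BC.png' into its three plate fields."""
    match = PLATE_RE.fullmatch(Path(filename).stem)
    return Plate(*match.groups()) if match else None


def parse_yolo_line(line: str, img_w: int = IMG_W, img_h: int = IMG_H) -> Box:
    """Convert one normalized YOLO label line to pixel corner coordinates."""
    class_id, xc, yc, w, h = line.split()
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    return Box(int(class_id), xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2)
```

For example, `parse_plate_filename("AA1234BC.png")` yields the fields `AA`, `1234`, and `BC`, while a name that does not match the assumed pattern returns `None` rather than raising an error.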

Citation. Shpir, Mariia, Nadiya Shvai, and Amir Nakib. "License Plate Images Generation with Diffusion Models." In ECAI. 2024.

Files

Synthetic-Ukrainian-LP-dataset-10k.zip

Files (236.8 MB)

Checksum Size
md5:1c4a9db36e5f0a662ca7a5a0fcb513dc 235.9 MB
md5:586c3291dec52288ba5f7c2d4a56b82e 893.1 kB
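The MD5 checksums listed above can be used to verify a download. The following is a minimal sketch using Python's standard `hashlib`; the helper names `file_md5` and `matches_checksum` are hypothetical, and which checksum belongs to which file is an assumption to confirm against the record itself.

```python
import hashlib


def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected: str) -> bool:
    """Compare a file against a checksum given as 'md5:<hex>' or plain hex."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return file_md5(path) == expected_hex
```

Assuming the larger entry is the archive, a check would look like `matches_checksum("Synthetic-Ukrainian-LP-dataset-10k.zip", "md5:1c4a9db36e5f0a662ca7a5a0fcb513dc")`.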

Additional details

Related works

Is supplement to
Conference paper: 10.3233/FAIA241053 (DOI)