Published November 2, 2023 | Version v1
Dataset Open

Synthbuster: Towards Detection of Diffusion Model Generated Images

  • 1. University of Paris-Saclay
  • 2. École Normale Supérieure Paris-Saclay
  • 3. French National Centre for Scientific Research

Description

Dataset described in the paper "Synthbuster: Towards Detection of Diffusion Model Generated Images" (Quentin Bammey, 2023, Open Journal of Signal Processing)

This dataset contains synthetic, AI-generated images from 9 different models:

  • DALL·E 2
  • DALL·E 3
  • Adobe Firefly
  • Midjourney v5
  • Stable Diffusion 1.3
  • Stable Diffusion 1.4
  • Stable Diffusion 2
  • Stable Diffusion XL
  • Glide

 

For each model, 1000 images were generated. The images are loosely based on the RAISE-1k images (Dang-Nguyen, Duc-Tien, et al. "Raise: A raw images dataset for digital image forensics." Proceedings of the 6th ACM multimedia systems conference. 2015.). For each image of the RAISE-1k dataset, a description was generated using the Midjourney /describe function and CLIP interrogator (https://github.com/pharmapsychotic/clip-interrogator/). Each of these prompts was then manually edited to make the results as photorealistic as possible and to remove the names of living persons and artists.

 

In addition, for the methods that require them, generation parameters were randomly selected within reasonable ranges.

The prompts and parameters used for each method can be found in the `prompts.csv` file.
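
As a starting point for working with this file, here is a minimal sketch that loads it with Python's standard `csv` module. The column layout of `prompts.csv` is not described on this page, so the sketch reads the column names from the header row instead of assuming them. The extraction path `synthbuster/prompts.csv` and the UTF-8 encoding are assumptions.

```python
import csv
from pathlib import Path


def load_prompts(csv_path):
    """Return (column_names, rows) read from a prompts CSV file.

    Column names come from the file's header row; none are assumed here.
    UTF-8 encoding is an assumption about the file.
    """
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        columns = list(reader.fieldnames or [])
    return columns, rows


if __name__ == "__main__":
    # Hypothetical location: adjust to wherever synthbuster.zip was extracted.
    path = Path("synthbuster/prompts.csv")
    if path.exists():
        columns, rows = load_prompts(path)
        print("columns:", columns)
        print("number of rows:", len(rows))
```

Printing the column names first is a cheap way to find out which fields hold the prompt text and which hold the per-method parameters before writing any further processing.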

 

This dataset can be used to evaluate AI-generated image detection methods. We recommend pairing the generated images with the real RAISE-1k images to evaluate whether a method can distinguish between the two. The RAISE-1k images are not included in this dataset; they can be downloaded separately at http://loki.disi.unitn.it/RAISE/download.html.
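
One possible way to quantify how well a detector separates the two sets is the area under the ROC curve (AUC). The sketch below is not necessarily the protocol used in the paper. It assumes a hypothetical detector that outputs one score per image, with higher scores meaning "more likely synthetic", and computes the AUC as the fraction of (synthetic, real) pairs ranked correctly, counting ties as half.

```python
def roc_auc(real_scores, synthetic_scores):
    """Area under the ROC curve, treating synthetic images as positives.

    Equals the probability that a randomly chosen synthetic image gets a
    higher score than a randomly chosen real image (ties count as 0.5).
    """
    if not real_scores or not synthetic_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for s in synthetic_scores:
        for r in real_scores:
            if s > r:
                wins += 1.0
            elif s == r:
                wins += 0.5
    return wins / (len(real_scores) * len(synthetic_scores))


if __name__ == "__main__":
    # Hypothetical detector scores, for illustration only.
    real = [0.10, 0.40]        # scores on real RAISE-1k images
    synthetic = [0.30, 0.90]   # scores on generated images
    print(roc_auc(real, synthetic))
```

The pairwise loop is quadratic, which stays manageable for 1000 real images against 1000 generated images per model. Computing the AUC separately for each generator shows which models a detector handles poorly.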

 

None of the images has undergone degradations such as JPEG compression or resampling, which leaves room to add your own degradations and test robustness to various transformations in a controlled manner.
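
As an illustration of such controlled degradations, here is a minimal sketch using the Pillow library (a third-party dependency, assumed to be installed). The helper names `jpeg_compress` and `resample`, the default quality factor, the default scale, and the choice of bicubic interpolation are all illustrative assumptions, not settings taken from the paper.

```python
import io

from PIL import Image


def jpeg_compress(image, quality=75):
    """Return a copy of `image` after one round of in-memory JPEG compression."""
    buffer = io.BytesIO()
    # JPEG has no alpha channel, so convert to RGB before saving.
    image.convert("RGB").save(buffer, format="JPEG", quality=quality)
    buffer.seek(0)
    compressed = Image.open(buffer)
    compressed.load()  # decode now so the buffer is no longer needed
    return compressed


def resample(image, scale=0.5):
    """Return `image` resized by `scale` using bicubic interpolation."""
    width, height = image.size
    new_size = (max(1, round(width * scale)), max(1, round(height * scale)))
    return image.resize(new_size, resample=Image.BICUBIC)


if __name__ == "__main__":
    # Synthetic stand-in image; replace with an image loaded from the dataset.
    original = Image.new("RGB", (64, 48), (120, 30, 200))
    degraded = jpeg_compress(resample(original, scale=0.5), quality=50)
    print(degraded.size, degraded.format)
```

Applying the same degradation to both the generated images and the real images keeps the comparison fair, since a detector could otherwise learn to spot the degradation rather than the generator.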

 

Files

synthbuster.zip (12.4 GB)
md5:0695bd328e16ea21c5c9cc2ae1d994ff
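
Given the size of the archive, it can be worth checking the download against the MD5 checksum listed above. The following is a minimal sketch using Python's standard `hashlib` module. The local path `synthbuster.zip` is an assumption about where the file was saved, and the file is read in chunks so that the full 12.4 GB never has to fit in memory.

```python
import hashlib
from pathlib import Path

# Checksum published on the dataset page for synthbuster.zip.
EXPECTED_MD5 = "0695bd328e16ea21c5c9cc2ae1d994ff"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hexadecimal MD5 digest of a file, reading 1 MiB at a time."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    archive = Path("synthbuster.zip")  # hypothetical local download path
    if archive.exists():
        actual = md5_of_file(archive)
        if actual == EXPECTED_MD5:
            print("checksum OK")
        else:
            print("checksum mismatch:", actual)
```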

Additional details

Related works

Is supplement to
Dataset: 10.5281/zenodo.10066047 (DOI)

Funding

APATE – A Prototype deepfake Assessment Toolbox for forensic Experts ANR-22-CE39-0016
Agence Nationale de la Recherche
vera.ai – vera.ai: VERification Assisted by Artificial Intelligence 101070093
European Commission

Dates

Submitted
2023-09-06
Accepted
2023-10-08