PRISM-36K: A Benchmark Dataset for AI-Generated Image Attribution
Authors/Creators
Description
PRISM-36K is a benchmark dataset of 36,000 AI-generated images for model-attribution research: the task of identifying which generative model produced a given image.
It accompanies the paper "PRISM: Phase-enhanced Radial-based Image Signature Mapping for AI-Generated Image Attribution" (Ricco, Onofri, Cima, Cresci, Di Pietro; arXiv:2509.15270).
What is in the dataset
The dataset contains 36,000 PNG images at 512 × 512 pixels, balanced across six text-to-image generators with 6,000 images per model:
- DALL-E 2 (Ramesh et al., 2022) — closed, accessed via OpenAI API
- FuseDream (Liu et al., 2021) — GAN + CLIP guidance
- PixArt-α (Chen et al., 2024) — diffusion transformer
- SANA (Xie et al., 2024) — diffusion transformer
- Stable Diffusion 1.4 (Rombach et al., 2022) — latent diffusion
- VQGAN-CLIP (Esser et al., 2021) — GAN + CLIP guidance
Each generator produces 150 images per prompt over a fixed set of 40 author-written English prompts (20 short + 20 long, paired by topic).
All images are stored in lossless PNG format to preserve frequency-domain artefacts that are critical to spectral attribution methods.
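The format and resolution stated above can be spot-checked without an imaging library, because every PNG file starts with a fixed 8-byte signature followed by an IHDR chunk that stores width and height. The sketch below is a minimal illustration using only the standard library; the function names `png_dimensions` and `is_expected_size` are hypothetical and not part of the dataset tooling.

```python
import struct
from pathlib import Path

# Fixed 8-byte signature that opens every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(header: bytes) -> tuple[int, int]:
    """Return (width, height) read from the first 24 bytes of a PNG file."""
    if len(header) < 24 or header[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # Bytes 8-11 hold the IHDR chunk length, bytes 12-15 the chunk type.
    if header[12:16] != b"IHDR":
        raise ValueError("first chunk is not IHDR")
    # Bytes 16-23: width and height as big-endian unsigned 32-bit integers.
    width, height = struct.unpack(">II", header[16:24])
    return width, height


def is_expected_size(path: Path, size: int = 512) -> bool:
    """Check that a file on disk is a PNG of size x size pixels."""
    with path.open("rb") as f:
        return png_dimensions(f.read(24)) == (size, size)
```

Running `is_expected_size` over a sample of files is a cheap way to confirm that no image was resized or re-encoded to another format after download.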
What makes this dataset useful
- Prompt-matched generations. The same 40 prompts are issued to every generator, so cross-model differences reflect generator-specific signatures rather than prompt drift.
- Architectural diversity. The six generators span GAN-based, CLIP-guided, and transformer-based diffusion families, with both open-weight and closed-API systems represented.
- Reproducible splits. The 100 random prompt-level train/test splits used in the paper are shipped as splits/splits_100.csv; one canonical "average split" (splits/average_split.json) is provided for direct reproduction of all figures and tables.
- Lossless integrity. Every image ships with a SHA-256 hash in checksums/SHA256SUMS (BSD-style, compatible with sha256sum -c) so users can verify their downloads.
- Rich metadata. A per-image manifest (metadata/images.csv) and a prompt manifest (metadata/prompts.csv) support filtering by model, prompt length, prompt pair, or specific generation iteration.
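Where sha256sum is not available, the integrity check can be approximated in Python. The sketch below assumes the usual BSD-style line layout, `SHA256 (path) = hexdigest`, and assumes that the paths listed in the file are relative to the dataset root; neither detail is confirmed here, so adjust `root` if the paths are laid out differently. The function names are illustrative only.

```python
import hashlib
import re
from pathlib import Path

# BSD-style checksum line: SHA256 (relative/path.png) = <64 hex digits>
BSD_LINE = re.compile(r"^SHA256 \((?P<path>.+)\) = (?P<digest>[0-9a-fA-F]{64})$")


def parse_bsd_line(line: str) -> tuple[str, str]:
    """Split one BSD-style line into (path, lowercase hex digest)."""
    m = BSD_LINE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised checksum line: {line!r}")
    return m["path"], m["digest"].lower()


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are never fully loaded in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_mismatches(sums_file: Path, root: Path) -> list[str]:
    """Return listed paths that are missing or whose hash differs.

    Assumption: listed paths are relative to `root`.
    """
    bad = []
    for line in sums_file.read_text().splitlines():
        if not line.strip():
            continue
        rel_path, expected = parse_bsd_line(line)
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            bad.append(rel_path)
    return bad
```

An empty list from `find_mismatches(Path("checksums/SHA256SUMS"), Path("."))` would indicate that every listed file is present and unmodified.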
Repository layout
PRISM-36K/
├── README.md
├── LICENSE.txt
├── CITATION.cff
├── CHANGELOG.md
├── metadata/
│   ├── prompts.csv
│   └── images.csv
├── splits/
│   ├── average_split.json
│   └── splits_100.csv
├── images/
│   ├── DALLE-2/
│   ├── FuseDream/
│   ├── PixArt-alpha/
│   ├── SANA/
│   ├── StableDiffusion-1.4/
│   └── VQGAN-CLIP/
└── checksums/
    └── SHA256SUMS
Image filename convention: <ModelName>_<promptid>_<iter>.png, with promptid ∈ 1..40 and iter ∈ 1..150.
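The convention can be parsed by splitting from the right, which keeps model names that contain hyphens or dots (for example StableDiffusion-1.4) intact. The following is a minimal sketch under two assumptions that the description does not confirm: that `<ModelName>` matches the directory names in the layout above, and that numeric fields may or may not be zero-padded (the parser accepts both; the name generator emits unpadded numbers). `ImageId`, `parse_name`, and `expected_names` are hypothetical helpers.

```python
from typing import NamedTuple

# Assumption: model names in filenames match the directory names under images/.
MODELS = (
    "DALLE-2", "FuseDream", "PixArt-alpha",
    "SANA", "StableDiffusion-1.4", "VQGAN-CLIP",
)


class ImageId(NamedTuple):
    model: str
    prompt_id: int
    iteration: int


def parse_name(filename: str) -> ImageId:
    """Parse <ModelName>_<promptid>_<iter>.png into its three fields."""
    stem, dot, ext = filename.rpartition(".")
    if not dot or ext.lower() != "png":
        raise ValueError(f"not a .png filename: {filename!r}")
    # Split from the right so underscores inside a model name would survive.
    model, prompt_id, iteration = stem.rsplit("_", 2)
    parsed = ImageId(model, int(prompt_id), int(iteration))
    if not (1 <= parsed.prompt_id <= 40 and 1 <= parsed.iteration <= 150):
        raise ValueError(f"field out of range in {filename!r}")
    return parsed


def expected_names(model: str) -> list[str]:
    """All names one model should contribute (unpadded numbers assumed)."""
    return [f"{model}_{p}_{i}.png" for p in range(1, 41) for i in range(1, 151)]
```

Under this convention each model contributes 40 × 150 = 6,000 names, and the six models together give the 36,000 images in the dataset.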
Intended uses
- Training and evaluating model-attribution classifiers for AI-generated images.
- Benchmarking real vs. fake detectors in a controlled multi-source setting.
- Studying frequency-domain and spectral fingerprints of generative models.
- Research on content provenance, generative-AI accountability, and related forensic problems.
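For the first use case, a prompt-level split keeps every image generated from a given prompt on the same side of the train/test boundary, so a classifier cannot score well by recognising prompt content it already saw during training. The sketch below generates such a split from scratch; it is not the procedure used in the paper and does not read the shipped split files, whose schema is not described here. The 20% test fraction, the seed, and the function names are assumptions for illustration, and the short/long topic pairing of prompts is ignored.

```python
import random


def prompt_level_split(prompt_ids, test_fraction=0.2, seed=0):
    """Partition prompt ids into disjoint (train, test) sets.

    test_fraction and seed are illustrative defaults, not values from the paper.
    """
    ids = sorted(prompt_ids)
    rng = random.Random(seed)  # local generator keeps the split reproducible
    rng.shuffle(ids)
    n_test = max(1, round(len(ids) * test_fraction))
    return set(ids[n_test:]), set(ids[:n_test])


def split_samples(samples, test_prompts):
    """Route (filename, prompt_id) pairs to train or test by prompt id."""
    train, test = [], []
    for filename, prompt_id in samples:
        (test if prompt_id in test_prompts else train).append(filename)
    return train, test
```

With the 40 prompts of this dataset, `prompt_level_split(range(1, 41))` holds out 8 prompts, which corresponds to 8 × 150 = 1,200 test images per generator.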
Companion resources
- Paper: arXiv:2509.15270
- Image-generation scripts (the code used to produce these images): github.com/emarich/PRISM-36K
- PRISM classifier and evaluation code: released upon full paper acceptance.
Licensing
Dataset (images and metadata): Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
Note on DALL-E 2 images. The 6,000 images in images/DALLE-2/ were generated via OpenAI's paid API and are subject to OpenAI's usage policies in addition to the dataset license above; users intending to use these images beyond academic research should consult OpenAI's current terms of service.
Note on NVIDIA SANA images. The 6,000 images in images/SANA/ are subject to the terms of the Apache License 2.0 in addition to the dataset license above.
Citing PRISM-36K
If you use this dataset, please cite both the paper and this Zenodo record. BibTeX entries and a CFF citation file are provided in the repository (README.md, CITATION.cff).
Limitations
- Closed-set scope. The dataset covers six specific generators; it is not designed to support open-set attribution to unseen models.
- English-only prompts authored by the dataset creators; no multilingual or in-the-wild prompts are included.
- Synthetic only. No real photographs are included; for real vs. fake benchmarks, real images must be sourced from a complementary dataset.
- No identifiable individuals. Prompts were authored to elicit generic scenes (objects, animals, landscapes); the dataset contains no images of identifiable real persons by design.
Files
Total size: 14.2 GB
| MD5 checksum | Size |
|---|---|
| 3aa38b0c3ffbd1e1749e52fab5c64dfb | 2.3 MB |
| 78ece11704120acec24282ef5de26297 | 346 Bytes |
| 02c7978972665e90c7c56c14ca97b7d7 | 2.1 kB |
| 0526526a53eaa92b2b2e51c34da1da93 | 2.8 kB |
| a4837fe174805875e8ad8b9adc709b24 | 10.5 kB |
| e1360791bad911db4c808d9a2e5777d7 | 3.7 MB |
| 4ab5525b06cecaa652a2eff2fa0805d2 | 14.2 GB |
| fb5d051e53001fdff7fec0f368f47190 | 20.8 kB |
| e4c3b0a0ca641329ce666eef6b715ec1 | 2.8 kB |
| 1d13ff65b5b7044d949a4226e23ad5a7 | 8.6 kB |
| 75a77e2a0182bd59cc01d723333e87b7 | 3.4 MB |
| f56e03405d8d84241f91987296c848a3 | 38.1 kB |
Additional details
Related works
- Is supplement to: Preprint, arXiv:2509.15270
- Is supplemented by: Software, https://github.com/emarich/PRISM-36K
- Is version of: Dataset, DOI 10.5281/zenodo.20038953
Funding
- King Abdullah University of Science and Technology: Center of Excellence on Generative AI (award 5940)
Dates
- Available: 2026-05-06 (Zenodo publication date)
- Collected: 2025-04-10 (images were generated)
Software
- Repository URL: https://github.com/emarich/PRISM-36K
- Programming language: Python
- Development status: Active
References
- A. Ramesh, P. Dhariwal, A. Nichol, C. Chu, and M. Chen, "Hierarchical text-conditional image generation with CLIP latents," arXiv preprint arXiv:2204.06125, 2022.
- X. Liu et al., "FuseDream: Training-free text-to-image generation with improved CLIP+GAN space optimization," arXiv preprint arXiv:2112.01573, 2021.
- J. Chen et al., "PixArt-α: Fast training of diffusion transformer for photorealistic text-to-image synthesis," arXiv preprint arXiv:2310.00426, 2023.
- E. Xie et al., "SANA: Efficient high-resolution image synthesis with linear diffusion transformers," arXiv preprint arXiv:2410.10629, 2024.
- R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer, "High-resolution image synthesis with latent diffusion models," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2022, pp. 10684–10695.
- M. Li, R. Xu, S. Wang, L. Zhou, X. Lin, C. Zhu, M. Zeng, H. Ji, and S.-F. Chang, "CLIP-Event: Connecting text and images with event structures," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 16420–16429, DOI: 10.1109/CVPR52688.2022.01593.