Published December 6, 2023 | Version v1
Dataset Open

Vincent van Gogh Authentication Dataset

  • 1. University of Liverpool
  • 2. Art Recognition AG




This dataset is a collection of artworks for research in art history, digital humanities, and computational forgery detection. It builds on the **VGDB-2016** collection (Folego et al., 2016) and includes a diverse range of images related to Vincent van Gogh's oeuvre.

The VGDB-2016 Dataset

- Source: The artworks were primarily sourced from Wikimedia Commons.

- Composition: The dataset comprises 126 original van Gogh artworks and 212 artworks by other artists from a similar period or artistic movement.

- Image Quality: Each artwork maintains a high-resolution standard with a density of at least 196.3 Pixels Per Inch (PPI).

- Special Inclusions: Two artworks with debated attribution are included for testing purposes.
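
Pixel density ties an artwork's physical size to the dimensions of its image. As a minimal sketch, the pixel size implied by the 196.3 PPI figure above can be estimated as follows. The function name and the example canvas size are hypothetical and not part of the dataset.

```python
CM_PER_INCH = 2.54


def pixel_size(width_cm: float, height_cm: float, ppi: float = 196.3) -> tuple[int, int]:
    """Return the (width, height) in pixels for a canvas of the given physical size.

    The default density is the 196.3 PPI figure quoted in the dataset description.
    """
    width_px = round(width_cm / CM_PER_INCH * ppi)
    height_px = round(height_cm / CM_PER_INCH * ppi)
    return width_px, height_px
```

For a hypothetical 50 cm by 40 cm canvas, `pixel_size(50, 40)` gives roughly 3864 by 3091 pixels.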

The Contrast Set

To aid in the study of art forgery and style analysis, we have included a carefully curated contrast set.

- Purpose: The contrast set features artworks that were not created by van Gogh but closely resemble his style. Such examples are crucial for developing and testing forgery detection algorithms.

- Components:

   - Similar: These are works by artists who shared a similar style or were part of the same artistic movement as van Gogh (already part of VGDB-2016).

   - Forgeries: This category includes non-autograph copies, artworks explicitly made in the style of van Gogh, and known forgeries.

   - Synthetic Fakes: Images generated with AI models, including Stable Diffusion 2.1 and StyleGAN3.

- Details of the Contrast Set:

   - Artworks by Similar Artists: 212 proxies.

   - Forgeries: 17 imitations, of which:

       - 9 by Otto Wacker (in folder Vincent Forgeries Wacker),

       - 8 by John Myatt (NOT CREATIVE COMMONS).

- AI-Generated Artworks:

   - Stable Diffusion: 30 images (in folder Vincent Stable Diffusion),

   - GANs fine-tuned on van Gogh's style: 30 images (in folder Vincent GAN finetune),

   - Random GANs (Raw GANs): 30 images (in folder Vincent GAN random).
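
After downloading, the folder contents can be checked against the counts listed above. The following is a minimal sketch: the folder names and counts come from this description, while the dataset root, the set of image extensions, and the function names are assumptions for illustration.

```python
from pathlib import Path

# Folder names and image counts as listed in the description above.
EXPECTED_COUNTS = {
    "Vincent Forgeries Wacker": 9,
    "Vincent Stable Diffusion": 30,
    "Vincent GAN finetune": 30,
    "Vincent GAN random": 30,
}

# Assumption: the images use one of these common extensions.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def count_images(folder: Path) -> int:
    """Count image files directly inside a folder (0 if the folder is missing)."""
    if not folder.is_dir():
        return 0
    return sum(
        1
        for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )


def check_contrast_set(root: Path) -> dict[str, tuple[int, int]]:
    """Map each folder name to a (found, expected) pair of image counts."""
    return {
        name: (count_images(root / name), expected)
        for name, expected in EXPECTED_COUNTS.items()
    }
```

Calling `check_contrast_set(Path("path/to/dataset"))` with the actual extraction directory returns one pair per folder, and any pair whose two values differ points to a missing folder or an unexpected file format.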

The metadata file ('van_gogh_forgeries.csv') extends the existing metadata file ('vgdb_2016.csv') with information on the included forgeries.
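
One way to combine the two metadata files is to append the forgery rows to the original rows and take the union of their columns. The sketch below uses only the Python standard library. The column layout of both files is not described here, so the code makes no assumption about column names. The function names and the UTF-8 encoding are assumptions.

```python
import csv
from pathlib import Path


def read_rows(path: Path) -> list[dict[str, str]]:
    """Read a CSV file with a header row into a list of dictionaries."""
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def merge_metadata(base_csv: Path, extension_csv: Path) -> tuple[list[str], list[dict[str, str]]]:
    """Append the extension rows to the base rows, using the union of both headers."""
    rows = read_rows(base_csv) + read_rows(extension_csv)

    # Collect column names in order of first appearance.
    columns: list[str] = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)

    # Columns missing from one of the files are filled with empty strings.
    merged = [{name: row.get(name) or "" for name in columns} for row in rows]
    return columns, merged
```

For example, `merge_metadata(Path("vgdb_2016.csv"), Path("van_gogh_forgeries.csv"))` returns the combined header and rows, which can then be written back out with `csv.DictWriter`.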

Usage Guidelines

This dataset is intended for academic and research purposes. Users are encouraged to apply this dataset in studies related to art history, digital humanities, and the development of computational tools for art analysis and forgery detection.

John Myatt Genuine Forgeries

We use 8 artworks by John Myatt in our dataset. These images are the property of Genuine Fakes Ltd. and were downloaded manually. For reproducibility, we list the titles of the artworks used:

- John Myatt's versions of van Gogh's:

   - Self Portrait

   - Starry Night with Snow and Distant Woodland

   - Starry Night with Wheat Field and Cypress Trees

   - Starry Night

   - Country Road in Provence by Night

   - A Pair of Old Shoes

   - The Harvest

   - Oleanders


We extend our gratitude to the contributors of VGDB-2016 and Wikimedia Commons for providing the foundational resources for this dataset.


Folego G, Gomes O, Rocha A. From Impressionism to Expressionism: Automatically Identifying Van Gogh's Paintings. In: 2016 IEEE International Conference on Image Processing (ICIP); 2016. p. 141–145.


Files (76.3 MB)
