Published January 21, 2021 | Version 1.0
Dataset Open

VIPPrint: A Large Scale Dataset for Colored Printed Documents Authentication and Source Linking

  • 1. University of Siena

Description

The ability to carry out meaningful forensic analysis on printed and scanned images plays a major role in many applications. First of all, printed documents are often associated with criminal activities, such as terrorist plans, child pornography pictures, and even fake packages. Additionally, printing and scanning can be used to hide the traces of image manipulation and even the synthetic nature of images, since the artifacts commonly found in manipulated and synthetic images disappear once the images are printed and scanned. A problem hindering research in this area is the lack of large-scale reference datasets for algorithm development and benchmarking. Motivated by this issue, we share a new dataset composed of a large number of synthetic and natural printed face images. The dataset can be used with a range of computer vision and machine learning approaches for two tasks: identifying the source printer of a document and detecting printed images that were generated by deepfake methods.

 

If you use this dataset, please cite our paper:

 

@Article{jimaging7030050,
AUTHOR = {Ferreira, Anselmo and Nowroozi, Ehsan and Barni, Mauro},
TITLE = {VIPPrint: Validating Synthetic Image Detection and Source Linking Methods on a Large Scale Dataset of Printed Documents},
JOURNAL = {Journal of Imaging},
VOLUME = {7},
YEAR = {2021},
NUMBER = {3},
ARTICLE-NUMBER = {50},
URL = {https://www.mdpi.com/2313-433X/7/3/50},
ISSN = {2313-433X},
DOI = {10.3390/jimaging7030050}
}

 

Notes

The dataset is still being extended, and new versions will be published periodically on this platform.

Files

HowTo-Extract.txt

Files (146.7 GB)

MD5 checksum                            Size
md5:ece52cd52a38edf7c0a6883e30b28c76    677 Bytes
md5:2d2ac91394fe57a0e24835bcd2e0aa57    10.7 GB
md5:f3a9d35ecff313644f59a3dfbcf59b0c    10.7 GB
md5:889b5aa836bee817b5bc5ce1059b21a8    10.7 GB
md5:ba1c69e520a93212cc58a3c6fd247307    10.7 GB
md5:f8edcd8dd588cea58a792a71e99ec117    10.7 GB
md5:078a6d30b532ed2e381502272c4612a3    10.7 GB
md5:26e9f18bff4f4cb77b5c2662945af40e    10.7 GB
md5:1d36d14b895eccbacc31f4a099d57afe    10.7 GB
md5:fc3da0f37f819ea95d1f86989d6267c1    10.7 GB
md5:443a035287ed797524b3995b265e0306    10.7 GB
md5:b808010dd56e536fc786421b74cb862f    10.7 GB
md5:b4acded0c80e159d392d8cbe819ee6b9    10.7 GB
md5:fc98f082c3071012af669e21e9f795e9    10.7 GB
md5:489cc72e69aff5554b1ce920f9cde6ff    7.1 GB
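
Given the size of the archive parts, it may be worth checking each download against the MD5 values listed above before extracting anything. The following is a minimal sketch of such a check using only the Python standard library. The helper names `md5_of_file` and `verify_download`, and the local file name `downloaded_part.bin`, are assumptions made for illustration; the file names of the individual parts are not shown in this listing.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use small even for multi-gigabyte parts.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected):
    """Compare a file's MD5 digest with a value from the file listing.

    `expected` may be given with or without the "md5:" prefix.
    """
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of_file(path) == expected_hex


if __name__ == "__main__":
    # Hypothetical local file name: replace it with the part you downloaded
    # and pair it with the matching checksum from the listing.
    candidate = Path("downloaded_part.bin")
    if candidate.exists():
        ok = verify_download(candidate, "md5:2d2ac91394fe57a0e24835bcd2e0aa57")
        print("checksum matches" if ok else "checksum MISMATCH")
```

A mismatch usually indicates an interrupted or corrupted transfer, in which case downloading that part again is the simplest remedy.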

Additional details

Funding

MSCA 2018 – Marie Skłodowska-Curie Actions - Beyond 2020 811208
European Commission