Dataset Open Access

ICDAR'15 SMARTPHONE DOCUMENT CAPTURE AND OCR COMPETITION (SmartDoc) - Challenge 1 (original version)

Chazalon, Joseph; Rusiñol, Marçal


Smartphones are replacing personal scanners. They are portable, connected, powerful and affordable. They are on their way to becoming the new entry point in business processing applications such as document archival, ID scanning and check digitization, to name just a few. In order to keep our workflows streamlined, we need to make these new capture devices as reliable as batch scanners.

We believe an efficient capture process should be able to:

  1. detect and segment the relevant document object during the preview phase;
  2. assess the quality of the capture conditions and help the user improve them;
  3. optionally, trigger the capture at the perfect moment;
  4. and produce a high-quality, controlled output based on the high resolution captured image.

This competition focuses on the first step of this process: efficiently detecting and segmenting document regions. The accompanying video shows the ideal output for the preview phase of an acquisition session, that is, the ideal document object detection (i.e. the ground truth, drawn as a red frame).

For this challenge, the input consists of a set of video clips, each containing a document from a predefined set. The output should be an XML file containing, for each frame of the video, the coordinates of the quadrilateral in which the document can be found.
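
As a rough illustration of the expected output (one quadrilateral per video frame, stored in an XML file), the following Python sketch serializes per-frame corner coordinates and reads them back. The element and attribute names (`segmentation_results`, `frame`, `point`, `index`, `name`, `x`, `y`), the corner labels and the sample coordinates are assumptions made for this example only; the actual schema is defined in the dataset documentation and should be used instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical corner labels: top-left, bottom-left, bottom-right, top-right.
CORNER_NAMES = ("tl", "bl", "br", "tr")


def quads_to_xml(quads_per_frame):
    """Serialize {frame_index: [four (x, y) corners]} into an XML string.

    Element and attribute names are illustrative assumptions,
    not the official SmartDoc schema.
    """
    root = ET.Element("segmentation_results")
    for index in sorted(quads_per_frame):
        corners = quads_per_frame[index]
        if len(corners) != 4:
            raise ValueError(f"frame {index}: expected 4 corners, got {len(corners)}")
        frame = ET.SubElement(root, "frame", {"index": str(index)})
        for name, (x, y) in zip(CORNER_NAMES, corners):
            ET.SubElement(frame, "point", {"name": name, "x": f"{x:.2f}", "y": f"{y:.2f}"})
    return ET.tostring(root, encoding="unicode")


def xml_to_quads(xml_text):
    """Parse XML produced by quads_to_xml back into {frame_index: [corners]}."""
    root = ET.fromstring(xml_text)
    quads = {}
    for frame in root.findall("frame"):
        points = {
            p.get("name"): (float(p.get("x")), float(p.get("y")))
            for p in frame.findall("point")
        }
        quads[int(frame.get("index"))] = [points[name] for name in CORNER_NAMES]
    return quads


# Made-up coordinates for two consecutive preview frames.
example = {
    1: [(100.0, 50.0), (90.0, 700.0), (600.0, 710.0), (610.0, 60.0)],
    2: [(102.5, 51.0), (91.0, 702.0), (603.0, 712.0), (612.0, 61.5)],
}
xml_text = quads_to_xml(example)
```

Keeping the four corners in a fixed, named order avoids ambiguity about which corner is which when a detected quadrilateral is later compared with the ground truth.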


Licence for the dataset of Challenge 1 (page outline detection in preview frames):

This work is licensed under a Creative Commons Attribution 4.0 International License. Author attribution should be given by citing the following conference paper: Jean-Christophe Burie, Joseph Chazalon, Mickaël Coustaty, Sébastien Eskenazi, Muhammad Muzzamil Luqman, Maroua Mehri, Nibal Nayef, Jean-Marc Ogier, Sophea Prum and Marçal Rusiñol: “ICDAR2015 Competition on Smartphone Document Capture and OCR (SmartDoc)”, in 13th International Conference on Document Analysis and Recognition (ICDAR), 2015.

If you use this dataset, please send us a short email at <icdar.smartdoc (at)> to tell us why it was useful to you, and whether you have results or publications we can reference on our website. Thank you!

Files (1.5 GB in total): two files, of 21.2 MB and 1.5 GB.