Published July 15, 2020 | Version 1.0.0
Dataset Open

Low resolution scanned text dataset for optical character recognition

  • 1. DAMTP, University of Cambridge

Description

A collection of scanned pages of English text designed for testing low-resolution OCR systems. There are 11 pieces of text, each 5 pages long. Each of these 55 pages is typeset in 18 different fonts and then scanned at 300 dpi, giving a total of 990 pages of scanned text. Versions downsampled to 60 dpi and 75 dpi are also included.
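
The page count and the relation between the scan resolution and the downsampled resolutions follow directly from the figures above. The short Python sketch below only redoes that arithmetic. The function and constant names are illustrative and not part of the dataset, and the linear scale factors assume the downsampled images cover the same physical page area as the 300 dpi scans, which the description does not state explicitly.

```python
# Figures taken from the dataset description.
NUM_TEXTS = 11          # distinct pieces of text
PAGES_PER_TEXT = 5      # pages in each piece
NUM_FONTS = 18          # fonts each page is typeset in
SCAN_DPI = 300          # resolution of the original scans
DOWNSAMPLED_DPIS = (60, 75)


def total_pages(num_texts, pages_per_text, num_fonts):
    """Number of scanned pages: every page is rendered once per font."""
    return num_texts * pages_per_text * num_fonts


def downsample_factor(scan_dpi, target_dpi):
    """Linear reduction factor between two resolutions (assumes same page area)."""
    return scan_dpi / target_dpi


pages = total_pages(NUM_TEXTS, PAGES_PER_TEXT, NUM_FONTS)
factors = {dpi: downsample_factor(SCAN_DPI, dpi) for dpi in DOWNSAMPLED_DPIS}

print(pages)    # 990
print(factors)  # {60: 5.0, 75: 4.0}
```

Under that assumption, a 60 dpi page has one fifth of the width and height in pixels of its 300 dpi scan, and a 75 dpi page has one quarter.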

Files

Size: 1.8 GB
Checksum: md5:8ecb666d8c724712b3f98baf911f5894
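
After downloading, the archive can be checked against the MD5 checksum listed above. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_expected` are hypothetical, and the archive's file name is not given on this page, so the path in the usage note is a placeholder.

```python
import hashlib

# Checksum as listed for the 1.8 GB download.
EXPECTED_MD5 = "8ecb666d8c724712b3f98baf911f5894"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that a multi-gigabyte archive is never held in memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """True if the file's MD5 digest equals the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_expected("path/to/downloaded_archive")` should return `True` for an intact download, where the path stands in for wherever the file was saved. MD5 is suitable here only as a check against transfer corruption, not as a security guarantee.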