Dataset Open Access

NewsEye / READ AS training dataset from Finnish Newspapers (19th C.)

Muehlberger, Günter; Hackl, Günter

The dataset comprises Finnish newspaper pages from the 19th century with carefully annotated text. The page images were provided by the National Library of Finland (NLF) and comprise 200 pages (training set). The data are formatted according to the PAGE format (cf. https://github.com/PRImA-Research-Lab/PAGE-XML/) and were produced with the Transkribus platform with the support of the NewsEye and READ projects. The guidelines used to create the article separation (AS) ground truth (GT) are uploaded here as well.
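As a rough illustration of how PAGE XML files such as these might be read, the following Python sketch collects the transcribed text of each text region. It relies only on element names defined by the PAGE schema (TextRegion, TextLine, TextEquiv, Unicode). The function name read_page_regions and the SAMPLE document are hypothetical and made up for this example; the files in the archive may carry additional annotations or use a different schema version, so treat this as a starting point under those assumptions.

```python
import xml.etree.ElementTree as ET


def read_page_regions(xml_text):
    """Return a list of (region_id, text) pairs from one PAGE XML document.

    Hypothetical helper for illustration; not part of the dataset or of any
    PAGE tooling.
    """
    root = ET.fromstring(xml_text)
    # The PAGE namespace contains a schema date that differs between versions,
    # so read it from the root tag instead of hard-coding it.
    ns = root.tag[: root.tag.index("}") + 1] if root.tag.startswith("{") else ""
    regions = []
    for region in root.iter(f"{ns}TextRegion"):
        lines = []
        for line in region.iter(f"{ns}TextLine"):
            # Use the line-level transcription (direct TextEquiv child of the line).
            unicode_el = line.find(f"{ns}TextEquiv/{ns}Unicode")
            if unicode_el is not None and unicode_el.text:
                lines.append(unicode_el.text)
        regions.append((region.get("id"), "\n".join(lines)))
    return regions


# Invented minimal document, only to show the expected element nesting.
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="example.jpg" imageWidth="100" imageHeight="100">
    <TextRegion id="r1">
      <TextLine id="r1l1"><TextEquiv><Unicode>Helsingfors</Unicode></TextEquiv></TextLine>
      <TextLine id="r1l2"><TextEquiv><Unicode>Dagblad</Unicode></TextEquiv></TextLine>
    </TextRegion>
  </Page>
</PcGts>"""

if __name__ == "__main__":
    for region_id, text in read_page_regions(SAMPLE):
        print(region_id, repr(text))
```

Nested text regions, if present, would be reported once on their own and once as part of their parent region; a real loader would probably need to handle that case explicitly.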

Files (1.0 GB)
Name, Size, Checksum
Article GT guidelines for Newseye.pdf, 75.1 kB, md5:fe01d7a05da416972ac784307079caaa
AS_TrainingSet_NLF_NewsEye_v2.zip, 1.0 GB, md5:66905b5c0a43a5d82b2a64b5216bbcb6
                   All versions   This version
Views              248            123
Downloads          40             18
Data volume        17.2 GB        8.1 GB
Unique views       213            109
Unique downloads   29             13