EPARCHOS - Historical Greek handwritten document dataset
- 1. Democritus University of Thrace, Department of Electrical and Computer Engineering, 67100 Xanthi, Greece
Description
The dataset originates from a Greek handwritten codex dating from around 1500-1530. It is a subset of the codex British Museum Addit. 6791, written by two hands, one that of Antonius Eparchos and the other that of Camillos Zanettus (ff. 104r-174v). It transmits texts by Hierocles (In Aureum carmen), Matthaeus Blastares (Collectio alphabetica) and, notably, Michael Psellos (De omnifaria doctrina). The script exhibits the most important abbreviations, logograms and conjunctions, which occur in virtually every Greek minuscule handwritten codex from the period of manuscript transliteration and the prevalence of the minuscule script (9th century) through the post-Byzantine years. The dataset consists of 120 scanned handwritten text pages, containing 9285 lines of text and 18809 words (6787 unique words). For each page, a PageXML file is provided containing the following ground truth:
- Text region polygon coordinates
- Text line polygon coordinates with the corresponding transcription text
- Word polygon coordinates with the corresponding transcription text
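As an illustration of how this ground truth might be read, the following is a minimal Python sketch. It assumes the files follow the common PAGE XML layout, in which `TextRegion`, `TextLine` and `Word` elements each carry a `Coords` child with a `points="x1,y1 x2,y2 ..."` attribute and a `TextEquiv/Unicode` child holding the transcription. These element names and the helper `read_elements` are assumptions for illustration, not something stated by the dataset authors; check them against an actual file from the archive.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Return an element tag without its '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def child(elem, name):
    """Return the first direct child of elem with the given local name, or None."""
    for c in elem:
        if local(c.tag) == name:
            return c
    return None


def parse_points(points):
    """Turn 'x1,y1 x2,y2 ...' into a list of (x, y) integer tuples."""
    return [tuple(int(v) for v in pair.split(",")) for pair in points.split()]


def read_elements(source, name="TextLine"):
    """Collect (polygon, transcription) pairs for every element called `name`.

    `source` is a file path or a binary file object.
    Assumed layout: <name><Coords points="..."/><TextEquiv><Unicode>...</Unicode></TextEquiv></name>
    """
    root = ET.parse(source).getroot()
    results = []
    for elem in root.iter():
        if local(elem.tag) != name:
            continue
        coords = child(elem, "Coords")
        equiv = child(elem, "TextEquiv")
        uni = child(equiv, "Unicode") if equiv is not None else None
        polygon = parse_points(coords.get("points", "")) if coords is not None else []
        text = (uni.text or "") if uni is not None else ""
        results.append((polygon, text))
    return results
```

Calling `read_elements(path, "Word")` would return the word-level polygons and transcriptions in the same way, and `read_elements(path, "TextRegion")` the region polygons (with an empty transcription if the region has none). Matching on local tag names avoids hard-coding a particular PAGE schema version.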
Files
Name | Size | Checksum |
---|---|---|
eparchos.zip | 114.7 MB | md5:e172aa8b4017436e37cf25991b5bf8b8 |
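The MD5 checksum listed for eparchos.zip can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module. The local path `eparchos.zip` in the usage comment and the helper name `md5_of_file` are assumptions for illustration.

```python
import hashlib

# Checksum as listed for eparchos.zip on this page.
EXPECTED_MD5 = "e172aa8b4017436e37cf25991b5bf8b8"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage (assumes the archive was saved as 'eparchos.zip' in the working directory):
# if md5_of_file("eparchos.zip") != EXPECTED_MD5:
#     raise SystemExit("Checksum mismatch: the download may be corrupted.")
```

Reading in chunks keeps memory use low regardless of archive size.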