Other Open Access

Generic HTR model for Old Cyrillic uncial and semi-uncial script styles (11th-16th c.)

Rabus, Achim; Thompson, Walker Riggs; Stökl Ben Ezra, Daniel

Training data consist of parts of the Russian Church Slavonic Great Reading Menology (16th century), the Old Church Slavonic Codex Suprasliensis (11th century), and the 11th-century manuscript of the Catecheses of Cyril of Jerusalem. This is a generic model suitable for transcribing a variety of Old Cyrillic script styles, including uncial and semi-uncial. The original training set was prepared in Transkribus, then exported and reused to train this model in Kraken. The export may have distorted baselines or line masks, or otherwise corrupted the data, which may have inflated the character error rate (CER) despite manual cleaning before Kraken training.
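For readers unfamiliar with the metric, the following is a minimal sketch of how a character error rate is commonly computed: the edit distance between a reference transcription and the model output, divided by the reference length. The function names `levenshtein` and `cer` and the sample strings are illustrative assumptions; the record does not state how its CER was measured, and this is not Kraken's own evaluation code.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Illustrative strings only, not taken from the training data.
sample_reference = "слово"
sample_hypothesis = "слава"
sample_cer = cer(sample_reference, sample_hypothesis)
```

This sketch counts Unicode code points, so combining diacritics and superscript letters each count as a character; a different normalization of the ground truth would give a different figure.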

Files (16.3 MB)
Name Size
Cyr02full_best.mlmodel 16.3 MB
md5:8ee4c48b93234211d5a7d4c7cc28937e
metadata.json 3.5 kB
md5:5e29292e37c9901579eaab4d0f191460
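The md5 values listed for the files can be used to check that a download arrived intact. The sketch below assumes the files have been saved locally under their listed names; the helper names `md5_of_file` and `verify` are hypothetical, and only the two checksums come from this record.

```python
import hashlib
from pathlib import Path

# Checksums as listed in this record.
EXPECTED_MD5 = {
    "Cyr02full_best.mlmodel": "8ee4c48b93234211d5a7d4c7cc28937e",
    "metadata.json": "5e29292e37c9901579eaab4d0f191460",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path):
    """True if the file's md5 matches the value listed for its file name."""
    expected = EXPECTED_MD5[Path(path).name]
    return md5_of_file(path) == expected
```

Calling `verify("Cyr02full_best.mlmodel")` in the download directory would then return `True` only if the local copy matches the listed checksum.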