Other Open Access
Rabus, Achim;
Thompson, Walker Riggs;
Stökl Ben Ezra, Daniel
<?xml version='1.0' encoding='utf-8'?>
<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datacite.org/schema/kernel-4" xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4.1/metadata.xsd">
  <identifier identifierType="DOI">10.5281/zenodo.7755483</identifier>
  <creators>
    <creator>
      <creatorName>Rabus, Achim</creatorName>
      <givenName>Achim</givenName>
      <familyName>Rabus</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0002-5366-1430</nameIdentifier>
      <affiliation>University of Freiburg</affiliation>
    </creator>
    <creator>
      <creatorName>Thompson, Walker Riggs</creatorName>
      <givenName>Walker Riggs</givenName>
      <familyName>Thompson</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0002-7203-9508</nameIdentifier>
      <affiliation>Heidelberg University</affiliation>
    </creator>
    <creator>
      <creatorName>Stökl Ben Ezra, Daniel</creatorName>
      <givenName>Daniel</givenName>
      <familyName>Stökl Ben Ezra</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0001-5668-493X</nameIdentifier>
      <affiliation>École pratique des hautes études</affiliation>
    </creator>
  </creators>
  <titles>
    <title>Generic HTR model for Old Cyrillic uncial and semi-uncial script styles (11th-16th c.)</title>
  </titles>
  <publisher>Zenodo</publisher>
  <publicationYear>2023</publicationYear>
  <subjects>
    <subject>Church Slavic</subject>
    <subject>Cyrillic</subject>
    <subject>Uncial</subject>
    <subject>Semi-uncial</subject>
  </subjects>
  <dates>
    <date dateType="Issued">2023-03-21</date>
  </dates>
  <language>cu</language>
  <resourceType resourceTypeGeneral="Other"/>
  <alternateIdentifiers>
    <alternateIdentifier alternateIdentifierType="url">https://zenodo.org/record/7755483</alternateIdentifier>
  </alternateIdentifiers>
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsVersionOf">10.5281/zenodo.7755482</relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="URL" relationType="IsPartOf">https://zenodo.org/communities/ocr_models</relatedIdentifier>
  </relatedIdentifiers>
  <version>1.0</version>
  <rightsList>
    <rights rightsURI="https://creativecommons.org/licenses/by/2.0/legalcode">Creative Commons Attribution 2.0 Generic</rights>
    <rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights>
  </rightsList>
  <descriptions>
    <description descriptionType="Abstract"><p>Training data consist of parts of the Russian Church Slavonic Great Reading Menology (16th century), Old Church Slavonic Codex Suprasliensis (11th century), and the 11th century manuscript of the Catecheses of Cyril of Jerusalem. This is a generic model suitable for transcribing a variety of Old Cyrillic script styles including uncial and semi-uncial. The original training set was prepared in Transkribus, whence it was exported and re-used to train this model. It is possible that the export caused some distortions of baselines or line masks, or corruptions in the data, which may have inflated CER (despite manual cleansing prior to Kraken training).</p></description>
  </descriptions>
</resource>
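
The abstract refers to CER (character error rate) without defining it. As a rough illustration only, CER is commonly computed as the edit distance between a reference transcription and the model output, divided by the length of the reference. The sketch below assumes that definition; the function names `levenshtein` and `cer` are hypothetical, and this is not the evaluation procedure used by the authors or by Kraken.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # deleting the first i characters of a
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return levenshtein(reference, hypothesis) / len(reference)
```

Note that Python strings are compared per Unicode code point, so combining diacritics and superscript letters, which are frequent in Church Slavic transcriptions, each count as separate characters unless the text is normalized first. Under this definition, transcription lines damaged during export would raise the measured CER even where the model's reading is correct.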
| | All versions | This version |
|---|---|---|
| Views | 111 | 111 |
| Downloads | 33 | 33 |
| Data volume | 342.9 MB | 342.9 MB |
| Unique views | 94 | 94 |
| Unique downloads | 22 | 22 |