
Other Open Access

Generic HTR model for Old Cyrillic uncial and semi-uncial script styles (11th-16th c.)

Rabus, Achim; Thompson, Walker Riggs; Stökl Ben Ezra, Daniel


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam##2200000uu#4500</leader>
  <datafield tag="041" ind1=" " ind2=" ">
    <subfield code="a">chu</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Church Slavic</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Cyrillic</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Uncial</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Semi-uncial</subfield>
  </datafield>
  <controlfield tag="005">20230324143542.0</controlfield>
  <controlfield tag="001">7755483</controlfield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Heidelberg University</subfield>
    <subfield code="0">(orcid)0000-0002-7203-9508</subfield>
    <subfield code="a">Thompson, Walker Riggs</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">École pratique des hautes études</subfield>
    <subfield code="0">(orcid)0000-0001-5668-493X</subfield>
    <subfield code="a">Stökl Ben Ezra, Daniel</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">16327306</subfield>
    <subfield code="z">md5:8ee4c48b93234211d5a7d4c7cc28937e</subfield>
    <subfield code="u">https://zenodo.org/record/7755483/files/Cyr02full_best.mlmodel</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">3456</subfield>
    <subfield code="z">md5:5e29292e37c9901579eaab4d0f191460</subfield>
    <subfield code="u">https://zenodo.org/record/7755483/files/metadata.json</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2023-03-21</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">user-ocr_models</subfield>
    <subfield code="o">oai:zenodo.org:7755483</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">University of Freiburg</subfield>
    <subfield code="0">(orcid)0000-0002-5366-1430</subfield>
    <subfield code="a">Rabus, Achim</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Generic HTR model for Old Cyrillic uncial and semi-uncial script styles (11th-16th c.)</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-ocr_models</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/licenses/by/2.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution 2.0 Generic</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;Training data consist of parts of the Russian Church Slavonic Great Reading Menology (16th century), Old Church Slavonic Codex Suprasliensis (11th century), and the 11th century manuscript of the Catecheses of Cyril of Jerusalem. This is a generic model suitable for transcribing a variety of Old Cyrillic script styles including uncial and semi-uncial. The original training set was prepared in Transkribus, whence it was exported and re-used to train this model. It is possible that the export caused some distortions of baselines or line masks, or corruptions in the data, which may have inflated CER (despite manual cleansing prior to Kraken training).&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.7755482</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.7755483</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">other</subfield>
  </datafield>
</record>
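
Each 856 field in the record above carries a file URL (subfield u), a size in bytes (subfield s), and a checksum of the form md5:<hex> (subfield z). The following is a minimal sketch of how these fields could be read and a downloaded file checked against its checksum, using only the Python standard library. The names SAMPLE, list_files, and matches_checksum are illustrative assumptions and are not part of Zenodo or Kraken; SAMPLE is a trimmed copy of the two 856 fields shown in the export.

```python
import hashlib
import xml.etree.ElementTree as ET

# Namespace used by the MARC21 "slim" XML schema in the export above.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Trimmed copy of the record, holding only its two 856 (file) fields.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">16327306</subfield>
    <subfield code="z">md5:8ee4c48b93234211d5a7d4c7cc28937e</subfield>
    <subfield code="u">https://zenodo.org/record/7755483/files/Cyr02full_best.mlmodel</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">3456</subfield>
    <subfield code="z">md5:5e29292e37c9901579eaab4d0f191460</subfield>
    <subfield code="u">https://zenodo.org/record/7755483/files/metadata.json</subfield>
  </datafield>
</record>"""


def list_files(marc_xml):
    """Return a (url, size_in_bytes, checksum) tuple for each 856 field."""
    # Parsing bytes lets the parser honour an XML encoding declaration if present.
    root = ET.fromstring(marc_xml.encode("utf-8"))
    files = []
    for field in root.findall("marc:datafield[@tag='856']", MARC_NS):
        # Map each subfield code ("s", "z", "u") to its text content.
        sub = {
            sf.get("code"): (sf.text or "")
            for sf in field.findall("marc:subfield", MARC_NS)
        }
        size = int(sub["s"]) if sub.get("s") else None
        files.append((sub.get("u"), size, sub.get("z")))
    return files


def matches_checksum(data, checksum):
    """Check bytes against a checksum string of the form '<algorithm>:<hex digest>'."""
    algorithm, _, expected = checksum.partition(":")
    return hashlib.new(algorithm, data).hexdigest() == expected.lower()


if __name__ == "__main__":
    for url, size, checksum in list_files(SAMPLE):
        print(url, size, checksum)
```

To check a downloaded copy of the model, one could read the file in binary mode and pass its bytes to matches_checksum together with the subfield z value for that file.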
                  All versions   This version
Views             111            111
Downloads         33             33
Data volume       342.9 MB       342.9 MB
Unique views      94             94
Unique downloads  22             22
