{"authors": [{"name": "Pinche, Ariane and Cl\u00e9rice, Thibault and Vlachou-Efstathiou, Malamatenia and Chagu\u00e9, Alix and Camps, Jean-Baptiste and Gille-Levenson, Matthias and Brisville-Fertin, Olivier and Fischer, Franz and Gervers, Michaels and Boutreux, Agn\u00e8s and Manton, Avery and Gabay, Simon and O'Connor, Patricia and Haverals, Wouter and Kestemont, Mike and Vandyck, Caroline and Kiessling, Benjamin", "affiliation": "CIHAM, CNRS; ALMAnaCH, Inria; CJM, PSL-ENC; IRHT, CNRS; Ca'Foscari; U-Toronto; UNIGE"}], "summary": "CATMuS Medieval 1.5.0", "description": "CATMuS (Consistent Approach to Transcribing ManuScript) Medieval is a Kraken HTR model trained on four different languages (in descending order of importance in the dataset: Old and Middle French, Latin, Spanish (and other languages of Spain), Italian) on strictly graphematic transcriptions. No abbreviations are resolved.\\n\\nThis model is the result of a collaboration between researchers from the CREMMA, GalliCorpora, HTRomance and DEEDS projects. 
It follows the CREMMA Guidelines (supplemented by the CREMMA Medii Aevi) and will be consolidated under the CATMuS Medieval Guidelines in an upcoming paper.\\n\\nThe model is trained with NFD Unicode normalization: each diacritic (including superscripts) is transcribed as its own character, separately from the `main` character.\\n\\nMetrics:\\n\\n- 3,361,410 characters\\n- 113,228 lines\\n- 1,602 files (double pages or single pages alike)\\n- 7,560 regions\\n\\nAll source datasets and papers are referenced in the related works section, all transcribers are mentioned in the collaborators section, and all members of partner projects are listed as authors.", "accuracy": 94.3, "license": "CC-BY-4.0", "script": ["Latn"], "name": "catmus-medieval.mlmodel", "graphemes": [" ", "&", "'", "(", ")", "*", "+", ",", "-", ".", "/", "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", ":", ";", "=", "?", "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z", "[", "]", "^", "a", "b", "c", "d", "e", "f", "g", "h", "i", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "\u00ac", "\u00b0", "\u00b6", "\u00d8", "\u00e6", "\u00f7", "\u0111", "\u0127", "\u0142", "\u0167", "\u017f", "\u0180", "\u01b7", "\u0268", "\u0300", "\u0301", "\u0303", "\u0307", "\u0308", "\u030c", "\u0313", "\u0314", "\u0327", "\u0335", "\u0336", "\u033d", "\u033e", "\u0342", "\u0363", "\u0364", "\u0365", "\u0366", "\u0367", "\u0368", "\u0369", "\u036a", "\u036b", "\u036c", "\u036d", "\u036f", "\u0391", "\u0392", "\u0393", "\u0394", "\u0395", "\u0396", "\u0397", "\u0398", "\u0399", "\u039a", "\u039b", "\u039c", "\u039d", "\u039e", "\u039f", "\u03a0", "\u03a1", "\u03a3", "\u03a4", "\u03a5", "\u03a6", "\u03a7", "\u03a8", "\u03a9", "\u03b1", "\u03b2", "\u03b3", "\u03b4", "\u03b5", "\u03b7", "\u03b8", "\u03b9", "\u03bb", "\u03bc", "\u03bd", "\u03bf", "\u03c0", "\u03c1", "\u03c2", "\u03c3", "\u03c4", "\u03c5", 
"\u03c7", "\u03c9", "\u1d47", "\u1d48", "\u1d56", "\u1dce", "\u1dd1", "\u1ddd", "\u1de0", "\u1de4", "\u1e9c", "\u1e9e", "\u2020", "\u2038", "\u204a", "\u204b", "\u205c", "\u207f", "\u2125", "\u27e6", "\u27e7", "\ua751", "\ua753", "\ua757", "\ua758", "\ua759", "\ua75f", "\ua76b", "\ua76d", "\ua76e", "\ua76f", "\ua770", "\ua775", "\ua7a6", "\ue8b7", "\uf017", "\uf025", "\uf02b", "\uf033", "\uf038", "\uf1ac", "\uf2da"]}