SegmOnto
Description
Layout analysis model trained with YALTAi (which relies on YOLO models) and Kraken. The data are annotated with the SegmOnto controlled vocabulary. Most of the training data are French texts, mainly but not exclusively printed documents, produced by the Gallic(orpor)a, FoNDUE and SETAF projects.
If you need to cite the paper:
@inproceedings{solfrini_OCR_2024,
  author={Solfrini, Sonia and Gabay, Simon and Pinche, Ariane and Beaulnes, Pierre-Olivier and Marques Oliveira, Aurélia and Gross, Geneviève and Solfaroli Camillocci, Daniela},
  title={Océriser les imprimés du XVIe siècle en langue française : le cas d'un corpus romand en caractères gothiques},
  address={Meknes, Morocco},
  year={2024},
  month={May},
  booktitle={Humanistica 2024},
  publisher={Association francophone des humanités numériques}
}
Files
(200.1 MB)
| MD5 checksum | Size |
|---|---|
| addef2cf2f746795850d058ec4d16081 | 5.1 MB |
| cdb9f447aa9b11989a7f30e08538c830 | 52.1 MB |
| 92b9d17dc3ce75b08f5682540b41fc5d | 6.3 MB |
| bb13144d1e7e7d57b7579bb26ff94210 | 136.8 MB |
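The MD5 checksums listed for the files can be used to check that a download is intact. The following is a minimal sketch using only the Python standard library. The function names `md5_of_file` and `matches_published_checksum` are illustrative and not part of YALTAi or Kraken. Because the file names are not reproduced here, the check only tests whether a file's digest is one of the four published values.

```python
import hashlib
from pathlib import Path

# MD5 checksums published in the file list of this record.
PUBLISHED_MD5 = {
    "addef2cf2f746795850d058ec4d16081",
    "cdb9f447aa9b11989a7f30e08538c830",
    "92b9d17dc3ce75b08f5682540b41fc5d",
    "bb13144d1e7e7d57b7579bb26ff94210",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks so large model files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path):
    """Return True if the file's MD5 digest is one of the published values."""
    return md5_of_file(path) in PUBLISHED_MD5
```

For example, calling `matches_published_checksum("downloaded_file")` on a local copy (the path here is hypothetical) should return `True` for an uncorrupted download. MD5 is suitable for detecting accidental corruption, not for guarding against deliberate tampering.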
Additional details
Related works
- Is documented by
- Journal article: https://hal.science/hal-04343404 (URL)
- Conference paper: https://hal.science/HUMANISTICA-2024/hal-04555002v1 (URL)