Tesseract OCR models for historic prints based on Latin script

doi:10.5281/zenodo.10125246

Published March 22, 2021 | Version v1

Other | Open access

Weil, Stefan (Producer)

frak2021-0.905.traineddata is a generic "best" model for Tesseract OCR. It can be used as the starting point for training derived models.

Although the model was trained mainly with German and Latin ground truth, it can also be used for texts in English, French and other Western European languages.
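As a rough sketch of how the "best" model might serve as a starting point for training, the snippet below builds a `combine_tessdata -e` command that extracts the LSTM component from the model file, which is the usual first step before fine-tuning with Tesseract's training tools. The helper name `build_extract_lstm_command` and the `work` directory are hypothetical, and the snippet only constructs the command; it assumes the Tesseract training tools are installed if the command is actually run.

```python
from __future__ import annotations

from pathlib import Path


def build_extract_lstm_command(best_model: str, output_dir: str) -> list[str]:
    """Build a combine_tessdata call that extracts the LSTM component.

    combine_tessdata picks the component to extract from the extension
    of the output file, so the output name ends in ".lstm".
    """
    model = Path(best_model)
    # Hypothetical layout: write <model name>.lstm into output_dir.
    lstm_path = Path(output_dir) / model.with_suffix(".lstm").name
    return ["combine_tessdata", "-e", str(model), str(lstm_path)]


command = build_extract_lstm_command("frak2021-0.905.traineddata", "work")
print(" ".join(command))
```

The extracted `.lstm` file would then typically be passed to `lstmtraining` as the checkpoint to continue from; the exact training setup depends on the ground truth and is not covered here.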

frak2021_0.905.traineddata is a generic "fast" model for Tesseract OCR which was derived from the "best" model and enhanced with a mostly German dictionary. It can be used only for recognition, where it is much faster than the "best" model, and not for training.
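A minimal sketch of running recognition with the "fast" model from Python follows. It assumes the `tesseract` command-line tool is installed and that the model file has been placed in a local directory (here called `models`, a hypothetical name); the helper names `build_tesseract_command` and `run_ocr` are illustrative only. Tesseract derives the language name from the model file name without the `.traineddata` extension.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def build_tesseract_command(image: str, output_base: str,
                            tessdata_dir: str, model_file: str) -> list[str]:
    """Build the tesseract CLI call for a given model file."""
    suffix = ".traineddata"
    if not model_file.endswith(suffix):
        raise ValueError(f"expected a {suffix} file, got {model_file!r}")
    # The language name passed to -l is the file name minus the extension.
    language = model_file[: -len(suffix)]
    return ["tesseract", image, output_base,
            "--tessdata-dir", tessdata_dir, "-l", language]


def run_ocr(image: str, output_base: str,
            tessdata_dir: str, model_file: str) -> Path:
    """Run tesseract and return the path of the text file it writes."""
    command = build_tesseract_command(image, output_base, tessdata_dir, model_file)
    subprocess.run(command, check=True)  # raises if tesseract fails
    return Path(output_base + ".txt")
```

For example, `run_ocr("page.png", "page", "models", "frak2021_0.905.traineddata")` would write the recognized text to `page.txt`, assuming `page.png` exists.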

Files (8.5 MB)

Name	MD5	Size
frak2021-0.905.traineddata	234e8bb819042f615576bd01aa2419fd	3.4 MB
frak2021_0.905.traineddata	fad5331435b9f22e52069585ad7c5394	5.1 MB
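The MD5 checksums listed for the two files can be used to check a download for corruption. The following is a small sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `verify_download` are hypothetical, and the files are assumed to have been downloaded to the current directory.

```python
import hashlib

# Checksums as published in the file list of this record.
EXPECTED_MD5 = {
    "frak2021-0.905.traineddata": "234e8bb819042f615576bd01aa2419fd",
    "frak2021_0.905.traineddata": "fad5331435b9f22e52069585ad7c5394",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected_md5.lower()
```

A call such as `verify_download("frak2021_0.905.traineddata", EXPECTED_MD5["frak2021_0.905.traineddata"])` returns `True` only for an intact download.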

	All versions	This version
Views	84	84
Downloads	24	24
Data volume	103.4 MB	103.4 MB

Resource type: Other
Publisher: Zenodo

Creative Commons Zero v1.0 Universal

CC0 waives copyright interest in a work you've created and dedicates it to the world-wide public domain. Use CC0 to opt out of copyright entirely and ensure your work has the widest reach.

Technical metadata

Created: November 14, 2023
Modified: November 14, 2023
