Published March 22, 2021 | Version v1
Other Open

Tesseract OCR models for historic prints based on Latin script

Description

frak2021-0.905.traineddata is a generic "best" model for Tesseract OCR. It can also serve as the starting point for training derived models.

Although the model was trained mainly on German and Latin ground truth, it can also be used for texts in English, French and other Western European languages.

frak2021_0.905.traineddata is a generic "fast" model for Tesseract OCR. It was derived from the "best" model and enhanced with a mostly German dictionary. It can be used only for recognition, where it is much faster than the "best" model; it cannot be used for training.
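As a rough illustration of how either model file might be used for recognition, the sketch below builds a Tesseract command line in Python. It assumes the `tesseract` command-line program is installed and on the PATH, and that the downloaded `.traineddata` file sits in a directory of your choice. The function names `build_tesseract_command` and `run_ocr` and all paths are hypothetical, chosen for this example only.

```python
import subprocess
from pathlib import Path

SUFFIX = ".traineddata"


def build_tesseract_command(image, output_base, model_file):
    """Build a Tesseract CLI call that uses one specific model file.

    All names and paths here are illustrative assumptions.
    """
    model = Path(model_file)
    if not model.name.endswith(SUFFIX):
        raise ValueError(f"not a .traineddata file: {model.name}")
    # Tesseract's -l value is the model file name without ".traineddata".
    lang = model.name[: -len(SUFFIX)]
    return [
        "tesseract", str(image), str(output_base),
        # Directory in which Tesseract looks for <lang>.traineddata.
        "--tessdata-dir", str(model.parent),
        "-l", lang,
    ]


def run_ocr(image, output_base, model_file):
    """Run Tesseract; by default it writes the text to <output_base>.txt."""
    command = build_tesseract_command(image, output_base, model_file)
    subprocess.run(command, check=True)
```

Calling `run_ocr("page.png", "page", "models/frak2021-0.905.traineddata")` would then be expected to write the recognized text to `page.txt`, provided the image and model paths exist.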

Files

Files (8.5 MB)

Checksum Size
md5:234e8bb819042f615576bd01aa2419fd 3.4 MB
md5:fad5331435b9f22e52069585ad7c5394 5.1 MB
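The MD5 checksums listed above can be used to check that a download is complete. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_listed_checksum` are hypothetical, and which checksum belongs to which model file is not stated here, so the local file name in any call is an assumption on the user's side.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large model files need little memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex digest>'."""
    expected = listed[4:] if listed.startswith("md5:") else listed
    return md5_of_file(path) == expected.strip().lower()
```

For example, `matches_listed_checksum("downloaded.traineddata", "md5:234e8bb819042f615576bd01aa2419fd")` returns `True` only if the local file has exactly that digest.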