Journal article Open Access
Tsimpiris, Alkiviadis;
Varsamis, Dimitrios;
Pavlidis, George
This article presents a procedure for improving optical character recognition (OCR) by preprocessing images of Greek food menus. To this end, both well-known and more sophisticated image preprocessing techniques were used. The performance of the Tesseract OCR engine was studied for selected binarization, thresholding, noise and morphological filtering methods applied to menu images before they were fed to the OCR engine. The output text is compared to the reference text of each image (ground text), and the values of the evaluation indices indicate the appropriate preprocessing method. Datasets of Greek food menu images with their respective ground text files were generated for the first time in this study, owing to the lack of alternative datasets in any language. OCR outputs and ground texts were evaluated using error rate and accuracy at the character and word levels. Applying OCR to Greek menu images yielded high accuracy for photos with high scanning resolution and for menus with distinct, clearly visible fonts.
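The abstract does not spell out how error rate and accuracy are computed. As a minimal sketch, assuming the common edit-distance definitions (error rate = Levenshtein distance divided by reference length, accuracy = 1 minus error rate), the character-level and word-level comparison of an OCR output against its ground text could look like the following. The function names `edit_distance`, `error_rate`, and `ocr_scores` are illustrative and are not taken from the article.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or word lists)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref, hyp):
    """Edit distance normalized by the reference length (assumed definition)."""
    if len(ref) == 0:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def ocr_scores(ground_text, ocr_text):
    """Character- and word-level error rate and accuracy for one image."""
    cer = error_rate(ground_text, ocr_text)
    # Words are split on whitespace; the article may tokenize differently.
    wer = error_rate(ground_text.split(), ocr_text.split())
    # Accuracy is taken as 1 - error rate; it can drop below zero when the
    # OCR output is much longer than the ground text.
    return {"cer": cer, "wer": wer,
            "char_accuracy": 1 - cer, "word_accuracy": 1 - wer}


if __name__ == "__main__":
    # Hypothetical ground text and OCR output for a single menu line.
    print(ocr_scores("χωριάτικη σαλάτα", "χωριατικη σαλάτα"))
```

Under these assumptions, the preprocessing method that gives the lowest error rates (or the highest accuracies) across the dataset would be the one the evaluation indices point to.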
Name | Size |
---|---|
varsamisIJCO1-2022.pdf (md5:b46ecc1d39dac9d2cbbdd2b4d41ad816) | 1.3 MB |