Published November 1, 2021 | Version v1
Journal article Open

Framework of diacritic segmentation for Arabic handwritten document

  • 1. Fakulti Teknologi Maklumat dan Komunikasi, Universiti Teknikal Malaysia Melaka (UTeM), Melaka, Malaysia

Description

In contemporary Standard Arabic and dialectal Arabic texts, diacritics and short vowels are absent, with exceptions made for scripts aimed at beginner learners of Arabic, religious texts, and significant political texts. Text without diacritics is considered ambiguous, because many words that differ only in their diacritic marks look identical. In this paper, we present a framework for segmenting diacritics from Arabic handwritten documents using a region-based segmentation technique. Arabic handwriting and the Mushaf Al-Quran contain many diacritical marks, so the diacritics must be properly extracted from the handwritten document to avoid losing useful features. The proposed framework is devised specifically to segment diacritics from Arabic handwritten images; it therefore includes no feature extraction, feature selection, or classification processes. We present the methodology used to fulfil the objectives of the paper and explain the preprocessing phases, concentrating on the segmentation phase in which the diacritics are segmented. Lastly, we identify region-based segmentation as the proposed technique to facilitate development throughout the experimental process.
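The abstract does not spell out how the region-based segmentation works, so the following is only a minimal sketch of one common region-based approach, not the authors' method. It assumes the page has already been binarized (1 for ink, 0 for background), labels 8-connected regions, and treats regions much smaller than the largest one as diacritic candidates. The function names `label_regions` and `split_diacritics` and the `area_ratio` threshold are hypothetical and chosen purely for illustration.

```python
from collections import deque


def label_regions(binary):
    """Return the 8-connected foreground regions of a binary image.

    `binary` is a list of rows holding 0 (background) or 1 (ink).
    Each region is returned as a list of (row, col) pixel coordinates.
    """
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    regions = []  # regions[k] holds the pixels that carry label k + 1
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or labels[r][c] != 0:
                continue
            label = len(regions) + 1
            labels[r][c] = label
            pixels = []
            queue = deque([(r, c)])
            # Breadth-first flood fill over the 8-neighbourhood.
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] == 1
                                and labels[ny][nx] == 0):
                            labels[ny][nx] = label
                            queue.append((ny, nx))
            regions.append(pixels)
    return regions


def split_diacritics(binary, area_ratio=0.3):
    """Split regions into diacritic candidates and main-body strokes.

    Assumption (illustrative only): a region whose pixel count is below
    `area_ratio` times the largest region's pixel count is a diacritic.
    """
    regions = label_regions(binary)
    if not regions:
        return [], []
    largest = max(len(pixels) for pixels in regions)
    diacritics = [p for p in regions if len(p) < area_ratio * largest]
    bodies = [p for p in regions if len(p) >= area_ratio * largest]
    return diacritics, bodies


# Toy image: one long stroke with a small mark above and another below.
image = [
    [0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 0],
    [1, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 0],
]
diacritics, bodies = split_diacritics(image)
```

A fixed area ratio is the simplest possible criterion; on real handwriting, position relative to the text baseline and stroke-width estimates would likely be needed as well, and those choices are outside what the abstract states.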
