Published April 16, 2023 | Version 0.6.16
Software Open

PyArabic: A Python package for Arabic text

Authors/Creators

  • 1. Bouira University, Bouira, Algeria

Description

An Arabic-language library for Python that provides basic functions for manipulating Arabic letters and text, such as detecting Arabic letters, identifying letter groups and their characteristics, and removing diacritics.

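A software library for the Arabic language, written in Python. It provides functions for handling letters and text, for example identifying the type of a letter, removing diacritics (harakat), and comparing vocalization (tashkeel).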
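To make the description concrete, here is a minimal standard-library sketch of two of the operations it mentions: detecting Arabic letters and removing diacritics. It is not PyArabic's implementation and does not call the package; the names `is_arabic_letter`, `strip_diacritics`, and `strip_tatweel` are hypothetical, and the code point ranges are an assumption that covers only the core Arabic block.

```python
# Illustrative sketch only: standard-library code, NOT PyArabic's implementation.
# All function names below are hypothetical.

# Tanwin, short vowels, shadda and sukun occupy U+064B..U+0652.
HARAKAT = frozenset(chr(cp) for cp in range(0x064B, 0x0653))
TATWEEL = "\u0640"  # elongation character (kashida)


def is_arabic_letter(ch: str) -> bool:
    """True for base letters of the core Arabic block (hamza..ghain, feh..yeh)."""
    return "\u0621" <= ch <= "\u063A" or "\u0641" <= ch <= "\u064A"


def strip_diacritics(text: str) -> str:
    """Remove every mark in HARAKAT, including shadda."""
    return "".join(ch for ch in text if ch not in HARAKAT)


def strip_tatweel(text: str) -> str:
    """Remove the tatweel elongation character."""
    return text.replace(TATWEEL, "")


if __name__ == "__main__":
    vocalized = "الْعَرَبِيَّةُ"
    print(strip_diacritics(vocalized))
    print([is_arabic_letter(ch) for ch in "ب a"])
```

The sketch treats shadda as a diacritic like any other; a variant that keeps shadda would only need to drop U+0651 from the set.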

Notes

Features

  • Arabic letters classification
  • Text tokenization into words or sentences
  • Strip Harakat (all, except Shadda, tatweel, last_haraka)
  • Separate and join Letters and Harakat
  • Reduce tashkeel
  • Measure tashkeel similarity (Harakats, fully or partially vocalized, similarity with a template)
  • Letters normalization (Ligatures and Hamza)
  • Numbers to words
  • Extract numerical phrases
  • Pre-vocalization of numerical phrases
  • Unshaping texts
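Two of the features listed above, separating and joining letters and harakat, and comparing fully and partially vocalized words, can be sketched with the standard library alone. This is a simplified illustration under stated assumptions, not the package's code: the helper names `separate`, `join`, and `vocalization_compatible` are hypothetical, and the compatibility rule (same letters, and marks that agree wherever both words carry them) is an assumed reading of "partially vocalized" similarity.

```python
# Illustrative sketch only: NOT PyArabic's implementation; names are hypothetical.

# Tanwin, short vowels, shadda and sukun occupy U+064B..U+0652.
HARAKAT = frozenset(chr(cp) for cp in range(0x064B, 0x0653))


def separate(word: str):
    """Split a word into its base letters and a per-letter list of marks."""
    letters, marks = [], []
    for ch in word:
        if ch in HARAKAT and letters:
            marks[-1] += ch          # attach the mark to the preceding letter
        else:
            letters.append(ch)       # a leading stray mark is kept as a "letter"
            marks.append("")
    return "".join(letters), marks


def join(letters: str, marks) -> str:
    """Inverse of separate(): interleave letters with their marks."""
    return "".join(letter + mark for letter, mark in zip(letters, marks))


def vocalization_compatible(a: str, b: str) -> bool:
    """Assumed rule: same letters, and marks agree wherever both are present."""
    letters_a, marks_a = separate(a)
    letters_b, marks_b = separate(b)
    if letters_a != letters_b:
        return False
    return all(x == y or not x or not y for x, y in zip(marks_a, marks_b))


if __name__ == "__main__":
    letters, marks = separate("مُحَمَّد")
    print(letters, len(marks))
    print(vocalization_compatible("كَتَبَ", "كتب"))
```

Keeping the marks as a list aligned with the letters makes the round trip `join(*separate(word)) == word` hold, and lets a shadda plus vowel pair stay attached to a single letter.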

Files

pyarabic-0.6.16.zip (430.5 kB)
md5:ddd69a8839cd568ab1f63406af0af354