Published March 5, 2026 | Version 4.1.0
Dataset | Open

Arabic WordNet 4.0

Authors/Creators

Description

Arabic WordNet 4.0 is a comprehensive lexical database for Modern Standard Arabic, derived from the Open English WordNet 2024 using the expand approach.

Features:
120,630 synsets (full OEWN 2024 parity)
136,041 lexical entries
184,238 senses
297,150 synset relations (0 skipped — exact parity with OEWN 2024)
97.3% ILI coverage for cross-linguistic linking
Full WN-LMF 1.4 XML format compliance
All synsets include Arabic definitions; lemmas carry full tashkeel (diacritical marks)
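
The figures above can be recomputed from the XML file itself. The following is a minimal sketch, assuming the release follows the usual WN-LMF element layout (LexicalEntry, Lemma, Sense, Synset, Definition, SynsetRelation). The embedded sample document, its IDs, and the `summarize` helper are hypothetical and only stand in for the real 11.8 MB file.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in document in WN-LMF layout. All IDs and values are invented
# for illustration; they are not taken from the actual release.
# The first lemma is written with tashkeel, the second without.
SAMPLE = """<LexicalResource>
  <Lexicon id="demo" label="Demo" language="arb" version="0.0">
    <LexicalEntry id="demo-e1">
      <Lemma writtenForm="\u0643\u0650\u062A\u064E\u0627\u0628" partOfSpeech="n"/>
      <Sense id="demo-e1-s1" synset="demo-1-n"/>
    </LexicalEntry>
    <LexicalEntry id="demo-e2">
      <Lemma writtenForm="\u0642\u0631\u0623" partOfSpeech="v"/>
      <Sense id="demo-e2-s1" synset="demo-2-v"/>
      <Sense id="demo-e2-s2" synset="demo-3-v"/>
    </LexicalEntry>
    <Synset id="demo-1-n" ili="i1" partOfSpeech="n">
      <Definition>placeholder</Definition>
    </Synset>
    <Synset id="demo-2-v" ili="i2" partOfSpeech="v">
      <Definition>placeholder</Definition>
      <SynsetRelation relType="hypernym" target="demo-3-v"/>
    </Synset>
    <Synset id="demo-3-v" ili="" partOfSpeech="v">
      <Definition>placeholder</Definition>
    </Synset>
  </Lexicon>
</LexicalResource>"""

# Arabic combining marks from fathatan (U+064B) through sukun (U+0652).
TASHKEEL = {chr(c) for c in range(0x064B, 0x0653)}


def summarize(root):
    """Count the headline statistics of a WN-LMF document."""
    synsets = root.findall(".//Synset")
    entries = root.findall(".//LexicalEntry")
    lemmas = [e.find("Lemma").get("writtenForm", "") for e in entries]
    return {
        "synsets": len(synsets),
        "lexical_entries": len(entries),
        "senses": len(root.findall(".//LexicalEntry/Sense")),
        "synset_relations": len(root.findall(".//Synset/SynsetRelation")),
        # Share of synsets with a non-empty ili attribute.
        "ili_coverage": sum(1 for s in synsets if s.get("ili")) / len(synsets),
        "with_definition": sum(1 for s in synsets if s.find("Definition") is not None),
        "diacritized_lemmas": sum(
            1 for form in lemmas if any(ch in TASHKEEL for ch in form)
        ),
    }


STATS = summarize(ET.fromstring(SAMPLE))
print(STATS)
```

For the real file, `ET.fromstring(SAMPLE)` would be replaced by `ET.parse(path).getroot()`, with `path` pointing at the downloaded XML.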

What's new in v4.1.0:
+10,720 satellite adjectives (pos=s) — completing full OEWN 2024 adjective coverage
+9 missing hub verbs (act/move, change, travel, make, communicate, and others)
+78 upper-ontology noun synsets completing the noun hierarchy
All 8 validation checks pass against OEWN 2024
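
The record does not list the eight validation checks. As an illustration of what a relation-parity check against OEWN 2024 could look like, here is a hedged sketch that compares sets of (source, relation type, target) triples after stripping lexicon prefixes. The `arb-` prefix, the example IDs, and both helper functions are assumptions made for this example, not the project's actual pipeline.

```python
def normalize(triples, prefix):
    """Strip a lexicon prefix from synset IDs so two wordnets can be compared."""
    def strip(synset_id):
        return synset_id[len(prefix):] if synset_id.startswith(prefix) else synset_id
    return {(strip(src), rel, strip(tgt)) for src, rel, tgt in triples}


def parity_report(source, derived):
    """List relations present in only one of the two wordnets."""
    return {
        "missing": sorted(source - derived),  # in the source, absent from the derived wordnet
        "extra": sorted(derived - source),    # in the derived wordnet only
    }


# Invented relation triples, for illustration only.
english = normalize(
    [("oewn-1-n", "hypernym", "oewn-2-n"), ("oewn-3-v", "hypernym", "oewn-4-v")],
    prefix="oewn-",
)
arabic = normalize(
    [("arb-1-n", "hypernym", "arb-2-n")],
    prefix="arb-",  # hypothetical prefix for the Arabic lexicon
)

REPORT = parity_report(english, arabic)
print(REPORT)
```

Under exact parity, both lists in the report would be empty.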

Methodology:
The initial 109,823 synsets (nouns, verbs, adjectives, adverbs) were generated with AI-assisted translation (Google Gemini 3 Pro Preview). The remaining 10,807 synsets (satellite adjectives, hub verbs, upper-ontology nouns) were translated with Anthropic Claude via an automated Docker pipeline.
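
The counts quoted on this page are mutually consistent. The short check below uses only figures stated above; the variable names are illustrative.

```python
# Additions listed under "What's new in v4.1.0".
added = {
    "satellite adjectives": 10_720,
    "hub verbs": 9,
    "upper-ontology nouns": 78,
}
initial_batch = 109_823              # synsets from the first translation pass
second_batch = sum(added.values())   # synsets from the second pass
total = initial_batch + second_batch

assert second_batch == 10_807        # matches "remaining 10,807 synsets"
assert total == 120_630              # matches the synset total under Features
print(second_batch, total)
```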

Attribution:
Derived from Open English WordNet 2024 (https://en-word.net/), licensed under CC BY 4.0, and Princeton WordNet 3.0 (https://wordnet.princeton.edu/), distributed under the Princeton WordNet License.

Files

Total size: 11.8 MB
Checksum: md5:4bcb600a96581e7b644ca52af6c6f31c

Additional details

Related works

Is derived from
Dataset: https://en-word.net/ (URL)
Dataset: https://wordnet.princeton.edu/ (URL)
