Automatic Arabic Named Entity Extraction and Classification for Information Retrieval

doi:10.5281/zenodo.4444738

Published January 16, 2021 | Version v1

Journal article Open

Omar ASBAYOU¹

1. Department of LEA, Lumière Lyon 2 University, Lyon, France

This article describes our rule-based Arabic named entity recognition (NER) and classification system. It is based on lists of classified proper names (PN) and, in particular, on syntactico-semantic patterns that yield a fine-grained classification of Arabic named entities (NE). These patterns rely on syntactico-semantic combinations of morpho-syntactic and syntactic entities. The system also uses a lexical classification of trigger words and NE extensions. These linguistic data are essential not only to named entity extraction but also to taxonomic classification and to determining NE boundaries. Our method is also based on contextualisation and on the notion of NE class attributes and values. Inspired by X-bar theory and immediate constituent analysis, we built a rule-based NER system composed of five levels of syntactico-semantic combination. We also show how the fine-grained NE annotations in our system output (an XML database) are exploited in information retrieval and information extraction.
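As a rough illustration of the kind of rule the abstract describes, the sketch below pairs a trigger word with a run of listed proper names and writes the result as XML with class attributes. It is a minimal toy under stated assumptions, not the author's system: the lexicon entries, the class and subtype labels, the element and attribute names, and the functions `tag_entities` and `to_xml` are all hypothetical, and the five-level combination of the actual system is not modelled.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy lexicons; the article's real lists are not reproduced here.
TRIGGERS = {
    "الرئيس": ("PERSON", "politician"),  # "the president"
    "مدينة": ("LOCATION", "city"),       # "city (of)"
}
PROPER_NAMES = {"محمد", "علي", "الرباط"}

# Sample sentence: "The president Mohamed Ali visited the city of Rabat."
SAMPLE = "زار الرئيس محمد علي مدينة الرباط"


def tag_entities(tokens):
    """Find spans of listed proper names that directly follow a trigger word.

    Returns (start, end, class, subtype) tuples; end is exclusive and the
    trigger word itself is left outside the span.
    """
    spans = []
    i = 0
    while i < len(tokens):
        if tokens[i] in TRIGGERS:
            j = i + 1
            # Extend the right boundary while tokens are known proper names.
            while j < len(tokens) and tokens[j] in PROPER_NAMES:
                j += 1
            if j > i + 1:
                cls, subtype = TRIGGERS[tokens[i]]
                spans.append((i + 1, j, cls, subtype))
                i = j
                continue
        i += 1
    return spans


def to_xml(tokens, spans):
    """Serialise the spans as <ne> elements carrying class attributes."""
    root = ET.Element("sentence")
    for start, end, cls, subtype in spans:
        ne = ET.SubElement(root, "ne", {"class": cls, "type": subtype})
        ne.text = " ".join(tokens[start:end])
    return root


if __name__ == "__main__":
    tokens = SAMPLE.split()
    root = to_xml(tokens, tag_entities(tokens))
    print(ET.tostring(root, encoding="unicode"))
```

Here the trigger word supplies the class and the proper-name list fixes the right boundary of the entity, which loosely mirrors the roles the abstract assigns to trigger words and NE extensions.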

Files

9620ijnlc01.pdf (2.2 MB, md5:b8ece3178d938034f0e528a17068f296)

Resource type: Journal article
Publisher: Zenodo
Published in: International Journal on Natural Language Computing (IJNLC), 9(6), 1-22, 2021.

Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: January 16, 2021
Modified: July 19, 2024
