Published November 8, 2021 | Version 2022
Journal article Open

AUTHORSHIP IDENTIFICATION OF SEVEN ARABIC RELIGIOUS BOOKS -A FUSION APPROACH-

  • 1. USTHB University

Description

In this paper, we conduct an investigation of automatic authorship attribution on seven Arabic religious books, namely: the holy Quran, Hadith and five other books written by five religious scholars. The Arabic styles are almost the same (i.e. Standard Arabic) for the seven books. The genre is the same and the topics of the different books are also the same (i.e. Religion).

Authorship characterization is based on four features: character trigrams, character tetragrams, word unigrams and word bigrams. Authorship identification is performed by four conventional classifiers: Manhattan distance, Multi-Layer Perceptron, Support Vector Machines and Linear Regression. In addition, a fusion approach with two fusion techniques is proposed to improve attribution performance.
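
To illustrate one feature/classifier pairing named above, the following is a minimal sketch of character trigram profiles compared by Manhattan distance. It is not the authors' implementation: the function names (`char_ngrams`, `manhattan`, `attribute`), the use of relative frequencies, and the nearest-profile decision rule are assumptions made for illustration, and the toy strings are placeholders rather than data from the paper.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Return relative frequencies of character n-grams in `text` (assumed feature form)."""
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}


def manhattan(p, q):
    """Manhattan (L1) distance between two sparse frequency profiles."""
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def attribute(unknown, references, n=3):
    """Pick the reference author whose n-gram profile is closest to `unknown`.

    `references` maps a (hypothetical) author label to a reference text.
    """
    profile = char_ngrams(unknown, n)
    return min(
        references,
        key=lambda author: manhattan(profile, char_ngrams(references[author], n)),
    )


# Toy usage with placeholder strings (not real corpus data).
references = {"A": "abababababab", "B": "xyzxyzxyzxyz"}
print(attribute("ababab", references))
```

Setting `n=4` gives character tetragrams; word unigrams and bigrams would follow the same pattern with a token list in place of the character string.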

The novelty of this research work lies in the following points: the proposal of a new type of fusion and the proposal of a new optimal rule dealing with unbalanced text documents.

A particular application is dedicated to the authorship discrimination between the Quran and Hadith, in order to see if the two books could have the same author or not.

Results show good authorship attribution performance, with an overall score ranging from 96% to 99% correct attribution using the conventional classifiers. This score reaches 100% correct attribution using the proposed fusion techniques.

Concerning the discrimination application, results reveal that the Quran and Hadith are stylistically different, which suggests that the two books have two different authors.

Files

19_S&H.pdf
