MAFQA: Multi-hop Arabic Fatwa Question Answering Dataset
Version: 2.0
Year: 2026

================================================================================
1. Dataset Overview
================================================================================

MAFQA (Multi-hop Arabic Fatwa Question Answering) is a structured dataset
designed to support research in Arabic natural language processing, question
answering, and multi-step reasoning within Islamic jurisprudential (fatwa)
contexts.
The dataset contains complex Arabic fatwa questions whose answers require
decomposing the question into intermediate sub-questions, identifying the
supporting passages for each reasoning step, and synthesizing a final answer.

MAFQA enables the development and evaluation of decomposition-based question
answering systems, recursive reasoning frameworks, and domain-specific Arabic
language models.

================================================================================
2. Dataset Purpose
================================================================================

The primary objectives of the MAFQA dataset are:

- To support research in Arabic multi-hop question answering.
- To enable training and evaluation of decomposition-based reasoning systems.
- To facilitate research in Islamic fatwa question answering.
- To provide a high-quality benchmark for Arabic QA tasks.
- To advance research in Arabic NLP, information retrieval, and AI reasoning.

================================================================================
3. Dataset Structure
================================================================================

Each dataset record represents a single Arabic fatwa question together with its
supporting evidence and structured multi-hop reasoning annotations.
The dataset is organized as a collection of JSON objects, where each object
contains the following fields:

original_index
A unique identifier representing the original position of the question in the
source fatwa dataset.

question
The original Arabic fatwa question posed by the user.

context
The full fatwa text from which the ruling and reasoning are derived.
This passage typically includes religious references, scholarly explanations,
and supporting arguments used to justify the ruling.

decomposition
A structured representation of the reasoning process used to answer the
question. The decomposition captures the intermediate reasoning steps required
for multi-hop inference. It includes:

    p_a
    The first supporting passage extracted from the fatwa text.
    This passage typically contains a key legal principle, rule, or concept.

    p_b
    The second supporting passage that complements the first passage and
    helps complete the reasoning chain needed to derive the final ruling.

    sq1, sq2, ...
    Intermediate sub-questions derived from the main question.
    These questions represent reasoning steps required to decompose the
    complex fatwa question into simpler components.

    sa1, sa2, ...
    The answers corresponding to each sub-question.
    These answers are grounded in the supporting passages and represent
    intermediate reasoning outputs.

    The number of sub-questions and sub-answers may vary depending on the
    complexity of the fatwa question.

a_gen
The final generated answer representing the complete religious ruling derived
from integrating the reasoning steps and supporting passages.
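
Because the number of sq/sa keys varies per record, code that consumes the
decomposition usually has to discover them at run time. The following is a
minimal sketch of one way to do that in Python. The function name
sub_question_pairs is hypothetical and not part of the dataset, and the sketch
assumes that every sqN key has a matching saN key, as in the example record.

```python
import re
from typing import Dict, List, Tuple

# Matches keys such as "sq1", "sq2", "sq10" and captures the numeric suffix.
_SQ_KEY = re.compile(r"sq(\d+)")


def sub_question_pairs(decomposition: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (sub-question, sub-answer) pairs ordered by numeric suffix.

    Assumption: each "sqN" key is accompanied by an "saN" key.
    A ValueError is raised if a sub-question has no matching answer.
    """
    numbers = []
    for key in decomposition:
        match = _SQ_KEY.fullmatch(key)
        if match:
            numbers.append(int(match.group(1)))

    pairs = []
    for n in sorted(numbers):  # numeric sort, so sq10 comes after sq2
        answer = decomposition.get(f"sa{n}")
        if answer is None:
            raise ValueError(f"sq{n} has no matching sa{n}")
        pairs.append((decomposition[f"sq{n}"], answer))
    return pairs


# Usage with placeholder strings standing in for the Arabic text.
example = {"p_a": "...", "p_b": "...",
           "sq1": "Q1", "sq2": "Q2", "sa1": "A1", "sa2": "A2"}
for question, answer in sub_question_pairs(example):
    print(question, "->", answer)
```

Sorting by the numeric suffix rather than by the key string keeps the
reasoning steps in order even if a record has ten or more sub-questions.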

Example Record:
{
  "original_index": 181,

  "question": "أمتلك شركة لشحن البضائع من اسطنبول إلى عمان، وأشحن البضائع منذ سنوات وفق آلية معينة، حيث يتم استلام البضائع نيابة عن أصحابها ويتم تسليمها لشركة الملاحة التي بدورها تنقلها إلى الأردن، قمت بتسليم البضائع مؤخراً لشركة الشحن يوم 24/2، وكان من المفترض شحنها يوم 28/2، لكن تأخر الشحن لغاية 2/6، وبقيت البضاعة في الميناء، ونتيجة للزلزال الذي حصل في تلك الفترة تلفت البضاعة، فهل أتحمل الضمان عن هذه البضائع أم لا؟",

  "context": "الحمد لله، والصلاة والسلام على سيدنا رسول الله. الأصل الشرعي أن الوكيل أمين لا يضمن إلا بالتعدي أو التقصير، فإن حصل منه تعدّ أو تقصير تصبح يده يد ضمان فيضمن ما يحصل من ضرر... فإذا قامت شركة الشحن وشركة الملاحة بكل الإجراءات وفق الأصول المتعارف عليها ولم يحصل تقصير أو إهمال وتلفت البضاعة بسبب الكوارث الطبيعية كالزلازل فلا ضمان على أي طرف.",

  "decomposition": {

    "p_a": "الأصل الشرعي أن الوكيل أمين لا يضمن إلا بالتعدي أو التقصير، فإذا لم يحصل منه تفريط فلا يضمن ما تلف من المال.",

    "p_b": "إذا قامت شركة الشحن وشركة الملاحة بكل الإجراءات وفق الأصول المتعارف عليها وتلفت البضاعة بسبب كارثة طبيعية كالزلازل فلا ضمان على أي طرف.",

    "sq1": "ما هو الحكم الشرعي لمسؤولية الوكيل عن المال الذي في يده؟",

    "sq2": "هل تتحمل شركة الشحن ضمان البضائع إذا تلفت بسبب كارثة طبيعية دون تقصير؟",

    "sa1": "الوكيل أمين ولا يضمن المال إلا إذا وقع منه تعدٍ أو تقصير.",

    "sa2": "إذا تلفت البضاعة بسبب كارثة طبيعية مثل الزلازل دون تقصير فلا ضمان."

  },

  "a_gen": "إذا لم يحصل منك تقصير في إجراءات الشحن وكان تلف البضاعة بسبب كارثة طبيعية مثل الزلزال فلا ضمان عليك، أما إذا ثبت تقصير من أحد الأطراف فيتحمل المسؤولية."
}

This structured format enables training and evaluation of multi-hop question
answering systems, decomposition-based reasoning models, and advanced Arabic
NLP systems.
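
As an illustration of working with records in this format, the sketch below
parses a collection of records and reports which of the documented fields are
missing from each one. It rests on several assumptions not stated in this
README: that the distribution file is either a single JSON array or JSON Lines
(one object per line), and that p_a and p_b are present in every
decomposition. The names load_records and missing_fields are hypothetical.

```python
import json
from typing import Any, Dict, List

# Field names taken from the "Dataset Structure" description.
REQUIRED_TOP_LEVEL = ("original_index", "question", "context",
                      "decomposition", "a_gen")
# Assumption: both supporting passages appear in every record.
REQUIRED_DECOMPOSITION = ("p_a", "p_b")


def load_records(text: str) -> List[Dict[str, Any]]:
    """Parse dataset text given as a JSON array or as JSON Lines.

    Assumption: the file uses one of these two layouts.
    """
    stripped = text.strip()
    if stripped.startswith("["):
        return json.loads(stripped)
    # Otherwise treat each non-empty line as one JSON object.
    return [json.loads(line) for line in stripped.splitlines() if line.strip()]


def missing_fields(record: Dict[str, Any]) -> List[str]:
    """List the documented fields that are absent from a record."""
    missing = [key for key in REQUIRED_TOP_LEVEL if key not in record]
    decomposition = record.get("decomposition")
    if isinstance(decomposition, dict):
        missing += [f"decomposition.{key}"
                    for key in REQUIRED_DECOMPOSITION
                    if key not in decomposition]
    return missing


# Usage with a tiny in-memory sample; placeholder strings replace Arabic text.
sample = ('[{"original_index": 1, "question": "q", "context": "c", '
          '"decomposition": {"p_a": "a", "p_b": "b"}, "a_gen": "g"}]')
for record in load_records(sample):
    print(record["original_index"], missing_fields(record))
```

When reading from disk, opening the file with encoding="utf-8" keeps the
Arabic text intact regardless of the platform default.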

================================================================================
4. Intended Uses
================================================================================

The MAFQA dataset is intended for:

- Arabic question answering research
- Multi-hop reasoning research
- Natural language processing model training and evaluation
- Information retrieval research
- AI reasoning system development
- Academic and educational purposes

================================================================================
5. License
================================================================================

This dataset is licensed under the Creative Commons Attribution 4.0
International License (CC BY 4.0).

You are free to:

- Share — copy and redistribute the material
- Adapt — remix, transform, and build upon the material

You must give appropriate credit, provide a link to the license, and indicate
if changes were made.

License details:
https://creativecommons.org/licenses/by/4.0/

================================================================================
6. Citation
================================================================================

If you use this dataset, please cite:

Al-Qahtani, M. (2026).
MAFQA: Multi-hop Arabic Fatwa Question Answering Dataset.
Zenodo. DOI: 10.5281/zenodo.XXXXXXX

================================================================================
7. Author and Contact
================================================================================

Author:
Manal Al-Qahtani
PhD Researcher, Information Systems
King Saud University
Riyadh, Saudi Arabia

Contact:
mal-qahtani@su.edu.sa

================================================================================
8. Disclaimer
================================================================================

This dataset is intended for research purposes only. It is not a religious
authority, and its contents do not constitute legal rulings. Users are
responsible for ensuring appropriate interpretation and use.

================================================================================
