ExpliCA Dataset
Description
The ExpliCA dataset comprises 100 causal natural language explanations (NLEs), each meticulously paired with a set of causal triples. This dataset was developed to advance research in explainable artificial intelligence, with a focus on understanding and modeling causal relationships in text.
The dataset has been structured to enable comprehensive analysis of both original, human-curated explanations and AI-generated explanations, allowing researchers to make direct comparisons between the two. This setup supports a deeper investigation into how causal reasoning is represented in AI-generated content versus human explanations.
Original Explanations and Causal Triples
Each of the 100 curated explanations is linked with a corresponding set of causal triples that capture the key components of the causal relationship (a minimal representation is sketched after the list):
- T1: The subject or initiator of the causal relationship.
- T2: The causal verb or predicate describing the cause-effect connection.
- T3: The object or effect, representing the outcome of the causal relationship.
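To illustrate how a triple-annotated explanation might be represented in code, the minimal Python sketch below uses hypothetical class and field names (CausalTriple, ExplicaRecord); it is not the dataset's actual schema, only one assumption about how the T1/T2/T3 components could be organised.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class CausalTriple:
    """One causal triple: T1 (subject), T2 (causal predicate), T3 (effect)."""
    t1_subject: str     # initiator of the causal relationship
    t2_predicate: str   # causal verb or predicate
    t3_effect: str      # outcome of the causal relationship

@dataclass
class ExplicaRecord:
    """A curated explanation paired with its causal triples (hypothetical layout)."""
    explanation: str
    triples: List[CausalTriple]

# Hypothetical example, not drawn from the dataset itself:
record = ExplicaRecord(
    explanation="Heavy rainfall caused the river to flood.",
    triples=[CausalTriple("heavy rainfall", "caused", "river flooding")],
)
print(record.triples[0].t2_predicate)  # -> "caused"
```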
Generated Explanations
In addition to the original explanations, the dataset includes two types of generated explanations (a pairing sketch follows the list):
- Explanations Generated from Triples: Explanations generated directly from the causal triples to assess the potential of automated explanation generation.
- Explanations Generated from Triples with Reference: Explanations generated from triples that also reference the original explanations, providing additional context and coherence.
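A minimal sketch, assuming hypothetical field names, of how an original explanation might be stored alongside its two generated counterparts for side-by-side comparison; the actual column names in the released files may differ.

```python
from dataclasses import dataclass

@dataclass
class ExplanationSet:
    """Original explanation plus its generated variants (hypothetical layout)."""
    original: str                   # human-curated explanation
    generated_from_triples: str     # generated from the causal triples alone
    generated_with_reference: str   # generated from triples plus the original as reference

def variants(item: ExplanationSet) -> dict:
    """Return all explanation variants keyed by condition, for easy comparison."""
    return {
        "original": item.original,
        "from_triples": item.generated_from_triples,
        "from_triples_with_reference": item.generated_with_reference,
    }
```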
Human, Automated, and LLM Evaluation Using the REFLEX Framework
To evaluate the quality and reliability of the explanations, human, automated, and LLM-based evaluations were conducted.
- Human Evaluation of Original Explanations: Human evaluators assessed the original explanations to establish baseline quality metrics.
- Human Evaluation of Generated Explanations: Human evaluators reviewed the generated explanations (both from triples alone and with reference to original explanations) for clarity, accuracy, and consistency with causal relationships.
The REFLEX framework, as presented in the PhD thesis of Miruna Clinciu, was applied to evaluate both original and generated explanations.
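The snippet below is not the REFLEX framework itself; it is only a generic automated similarity check, built on Python's standard-library difflib.SequenceMatcher, shown to illustrate the kind of automated comparison between generated and original explanations such an evaluation pipeline might include.

```python
from difflib import SequenceMatcher

def surface_similarity(original: str, generated: str) -> float:
    """Character-level similarity in [0, 1]; a crude automated proxy,
    not a substitute for human or REFLEX-based evaluation."""
    return SequenceMatcher(None, original.lower(), generated.lower()).ratio()

# Hypothetical example values, not drawn from the dataset:
original = "Heavy rainfall caused the river to flood."
generated = "The river flooded because of heavy rainfall."
print(f"surface similarity: {surface_similarity(original, generated):.2f}")
```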
Files
Name | Size | MD5
---|---|---
ExpliCA Dataset.zip | 9.7 MB | md5:3dc702032c391f8158e58e199cc4cbec
Additional details
Dates
- Available: 2024-11-04 (Release after Thesis Submission)
Software
- Repository URL: https://github.com/MirunaClinciu/ExpliCA
- Programming language: Python