This folder contains all datasets and a collection of scripts used in the
experiments for LPAR 2020 submission 44. To recreate the datasets, execute the
scripts in the order listed below.

=================================================================================
*********************************************************************************
=================================================================================

These scripts were executed on machines with 
 - macOS 10.14.4, JDK 12
 - Debian GNU/Linux 9, JDK 11

=================================================================================
*********************************************************************************
=================================================================================

Task extraction is based on ontologies obtained from the OWL 2 EL classification
track of the 2015 OWL Reasoner Evaluation (ORE) competition.
The datasets can be downloaded from https://zenodo.org/record/18578#.XlekOBdG2L4

Ontologies used in the experiments are located in el/classification/


### el-extract_tasks.sh
This script is used to extract explanation tasks from a collection of ontologies.
It uses two files, el-processed.txt and el-exceptions.txt, to keep track of which
ontologies have already been processed and for which an exception was thrown,
respectively, in case the script is interrupted and needs to be resumed later.
The tasks are written in a custom JSON format for storing proofs as
collections of inference steps (each task is viewed as a single-step proof from
a justification to a conclusion). Tasks are unique up to renaming of concept
and role names and use anonymized names A, B, ... for concepts and r, s, ...
for roles.
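The README does not fix the JSON schema. Purely as a hypothetical sketch (all field names below are assumptions, not the scripts' actual format), a single-step task could look like:

```python
import json

# Hypothetical sketch of one extracted task: a single-step proof from a
# justification (the premises) to a conclusion, using anonymized names.
# All field names here are assumptions, not the actual schema.
task = {
    "conclusion": "SubClassOf(A, B)",
    "inferences": [
        {
            "conclusion": "SubClassOf(A, B)",
            "premises": ["SubClassOf(A, C)", "SubClassOf(C, B)"],
        }
    ],
}
print(json.dumps(task, indent=2))
```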

Input: el/classification/*.owl
       el-processed.txt (list of processed ontologies)
       el-exceptions.txt (ontologies for which an exception was thrown; these are skipped on subsequent runs)
       el-tasks/taskXXXXX.json (tasks extracted in previous run of the script)
Output: el-tasks/taskYYYYY.json (merged tasks)
        el-tasks/numbYYYYY.txt (number of occurrences of taskXXXXX in input)
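The resume behaviour can be sketched as follows; this helper is illustrative only (only the bookkeeping file names come from this README, the actual matching logic of the script is an assumption):

```python
from pathlib import Path

def unprocessed_ontologies(onto_dir, processed="el-processed.txt",
                           exceptions="el-exceptions.txt"):
    """Return ontology files not yet listed in either bookkeeping file,
    so an interrupted run can be resumed. Illustrative sketch only."""
    done = set()
    for bookkeeping in (processed, exceptions):
        path = Path(bookkeeping)
        if path.exists():
            done.update(path.read_text().splitlines())
    return [owl for owl in sorted(Path(onto_dir).glob("*.owl"))
            if owl.name not in done]
```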


### el-generate_???.sh
These three scripts generate proofs for the explanation tasks, where ??? is one
of ELK, LETHE, and FAME.

Input: el-tasks/taskXXXXX.json
Output: el-proofs-???/proofXXXXX.json


### el-unique_elk.sh
This script extracts unique proofs from the derivation structures obtained from
ELK in the previous step.

Input: el-proofs-ELK/proofXXXXX.json
Output: el-proofs-ELK/unique/proofXXXXX.json (hypergraphs)
        el-proofs-ELK/trees/proofXXXXX.json (trees)
        el-proofs-ELK/unique/times.csv (average runtime in ms for each proof)
        el-proofs-ELK/trees/times.csv


### el-extract_rules.sh
This script extracts all 'rules' that are used throughout any of the generated
proofs, i.e., inference steps that are unique up to renaming of concept and
role names.

Input: el-proofs-*/proofXXXXX.json
       el-rules/ruleYYYYY.json (potentially already extracted rules)
Output: el-rules/ruleYYYYY.json (merged with existing rules)
        el-rules/numbYYYYY.txt (number of occurrences of ruleYYYYY in input)
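Identifying inference steps up to renaming can be done by renaming names in order of first occurrence. The sketch below illustrates the idea on a simplified textual axiom syntax; the keyword set, the tokenisation, and the uppercase/lowercase convention are all assumptions, not the scripts' actual implementation:

```python
import re

# Assumed keywords of a hypothetical textual axiom syntax; the real
# scripts work on JSON proofs, so this only illustrates the idea.
KEYWORDS = {"SubClassOf", "ObjectSomeValuesFrom", "ObjectIntersectionOf"}

def canonicalize(step: str) -> str:
    """Rename concept names to A, B, ... and role names to r, s, ... in
    order of first occurrence, so that two steps differing only in their
    names map to the same string."""
    concepts, roles = {}, {}
    def rename(match):
        token = match.group(0)
        if token in KEYWORDS:
            return token
        if token[0].isupper():   # assumed: concept names start uppercase
            table, fresh = concepts, "ABCDEFGHIJ"
        else:                    # assumed: role names start lowercase
            table, fresh = roles, "rstuvwxyz"
        if token not in table:
            table[token] = fresh[len(table)]
        return table[token]
    return re.sub(r"[A-Za-z]\w*", rename, step)
```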


### el-rules_stats.sh and el-proofs_stats.sh
These scripts compute statistics for the generated rules and proofs and write
them into CSV files (also for the 'unique' and 'trees' subfolders).

Input: el-proofs-*/proofXXXXX.json
       el-tasks/taskXXXXX.json (used for checking correctness of each proof)
       el-rules/ruleYYYYY.json
Output: el-proofs-*/stats.csv
        el-rules/stats.csv
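The columns of stats.csv are not documented here. As a sketch, two plausible per-proof statistics could be computed like this (the proof schema and both field names are assumptions):

```python
import json

def proof_stats(proof_json: str) -> dict:
    """Compute simple statistics for one proof. Assumes a proof is a
    JSON object with an "inferences" list of steps, each carrying a
    "premises" list; both field names are assumptions, not the actual
    schema used by the scripts."""
    steps = json.loads(proof_json).get("inferences", [])
    return {
        "steps": len(steps),
        "max_premises": max((len(s.get("premises", [])) for s in steps),
                            default=0),
    }
```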


### el-rules_images.sh and el-proofs_images.sh
These scripts generate PNG images for all rules and proofs (also for the
'unique' and 'trees' subfolders).

Input: el-proofs-*/proofXXXXX.json
       el-rules/ruleYYYYY.json
Output: el-proofs-*/proofXXXXX.png
        el-rules/ruleYYYYY.png
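The rendering toolchain used by these scripts is not specified here; one common route to PNG is via Graphviz. The sketch below emits DOT text for an assumed proof schema, with an edge from each premise to its conclusion (a PNG could then be produced with `dot -Tpng`):

```python
import json

def proof_to_dot(proof_json: str) -> str:
    """Turn a proof (assumed schema: {"inferences": [{"conclusion": ...,
    "premises": [...]}]}) into Graphviz DOT text. Illustrative sketch
    only; the scripts' actual rendering pipeline is not documented here."""
    lines = ["digraph proof {"]
    for step in json.loads(proof_json).get("inferences", []):
        for premise in step.get("premises", []):
            lines.append(f'  "{premise}" -> "{step["conclusion"]}";')
    lines.append("}")
    return "\n".join(lines)
```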


### el-proofs_charts.sh
This script generates charts from the extracted statistics.

Input: el-proofs-ELK/unique/stats.csv
       el-proofs-FAME/stats.csv
       el-proofs-LETHE/stats.csv
Output: el-charts/???.csv, ???.txt, and ???.pdf (chart data, labels, and rendered charts)

