Dataset Open Access
Meaning and Understanding in Human-Centric AI (MUHAI) Benchmark
Task 3 Understanding complex concepts
This dataset supports investigating whether symbolic reasoning can help statistical models to understand complex concepts. Complex concepts are expressed in the form of image schemas (i.e., mental templates that summarise human experiences as patterns of object relations and actions).
The submission includes the ImageSchema dataset with the following ground truth labels:
1. the question to be asked
2. the type of image schema (class)
3. the type of phrasing (literal, metaphoric, or a distracting sentence)
4. the type of questioning (one referring to the image schema by name, and another describing its content)
5. a label indicating whether the answer is yes or no
6. the image schema variables identified
Each sample in the dataset starts with a question about the presence of a given schema in the sentence that follows, and continues with a single sentence to be classified as either "yes" or "no" (presence or absence of the schema).
This can be used by a system (e.g., a language model, a symbolic system, or a neuro-symbolic approach) to identify image schemas. The file "language-models.csv" includes the results of the two language models that were tested (T0pp and GPT-3).
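The sample structure described above could be represented roughly as follows. This is only a sketch: the field names, the `SchemaSample` class, the helper functions and the example sentence are all hypothetical and are not taken from the dataset files, so they should be adapted to the actual column names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SchemaSample:
    """One benchmark sample. All field names are hypothetical."""
    question: str       # question followed by the sentence to classify
    schema: str         # image schema class
    phrasing: str       # e.g. "literal", "metaphoric" or "distractor"
    question_type: str  # schema referred to by "name" or by "description"
    answer: str         # ground truth: "yes" or "no"
    variables: List[str] = field(default_factory=list)  # schema variables


def normalise_answer(raw: str) -> str:
    """Map a free-text system reply to "yes" or "no".

    Assumption: any reply that does not start with "yes" counts as "no".
    """
    return "yes" if raw.strip().lower().startswith("yes") else "no"


def is_correct(sample: SchemaSample, raw_reply: str) -> bool:
    """Compare a normalised system reply with the ground truth label."""
    return normalise_answer(raw_reply) == sample.answer


# Invented sample for illustration only; not copied from the dataset.
sample = SchemaSample(
    question=(
        "Does the following sentence contain the CONTAINMENT schema? "
        "The keys are in the drawer."
    ),
    schema="CONTAINMENT",
    phrasing="literal",
    question_type="name",
    answer="yes",
    variables=["keys", "drawer"],
)
```

Calling `is_correct(sample, "Yes, it does.")` on this invented sample returns `True`. Treating every non-"yes" reply as "no" is a simplification; a stricter evaluation might reject replies that are neither.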
Metrics used to evaluate:
1. Accuracy: no. of correct predictions / total no. of sentences ((TP + TN) / (P + N))
2. Precision: no. of correct image schema predictions / total no. of image schema predictions (TP / (TP + FP))
3. Recall: no. of correct image schema predictions / total no. of sentences containing an image schema (TP / (TP + FN))
4. F1: harmonic mean of Precision and Recall (2 * Precision * Recall / (Precision + Recall))
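The four metrics can be computed from paired gold and predicted "yes"/"no" labels as in the minimal sketch below. It assumes "yes" is the positive class (schema present); the function name `binary_metrics` and the toy labels are illustrative and are not part of the released code.

```python
from typing import Dict, Sequence


def binary_metrics(gold: Sequence[str], predicted: Sequence[str]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 with "yes" as the positive class."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    pairs = list(zip(gold, predicted))
    tp = sum(g == "yes" and p == "yes" for g, p in pairs)  # schema correctly found
    tn = sum(g == "no" and p == "no" for g, p in pairs)    # absence correctly found
    fp = sum(g == "no" and p == "yes" for g, p in pairs)   # schema wrongly predicted
    fn = sum(g == "yes" and p == "no" for g, p in pairs)   # schema missed

    # Each ratio falls back to 0.0 when its denominator is zero.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels for illustration only; these are not benchmark results.
gold = ["yes", "yes", "no", "no", "yes"]
predicted = ["yes", "no", "no", "yes", "yes"]
scores = binary_metrics(gold, predicted)
```

On the toy labels there are two true positives, one true negative, one false positive and one false negative, so accuracy is 0.6 and precision, recall and F1 are all 2/3.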
Code: https://github.com/kmitd/muhai-EPL