Published November 26, 2025 | Version v3
Dataset Open

ALD/E Neuro-symbolic Query Benchmark: 33 Scientific Queries over Machine-Actionable ORKG Comparisons

  • 1. Technische Informationsbibliothek (TIB)
  • 2. Eindhoven University of Technology
  • 3. University of Warwick
  • 4. Merck KGaA

Description

This record contains the ALD/E Neuro-symbolic Query Dataset, a curated collection of 33 scientific queries (19 on atomic layer deposition, ALD, and 14 on atomic layer etching, ALE) defined over machine-actionable Open Research Knowledge Graph (ORKG) comparisons extracted from published review tables.

Each query bundle includes:

  • a natural-language question (brief + detailed forms),

  • the corresponding SPARQL gold-standard query,

  • CSV exports of the underlying ORKG comparison tables,

  • symbolic results (results_SPARQL.csv),

  • neural and symbolic-context-augmented results from 21 language-model systems,

  • machine-readable metadata linking to the source paper, DOI, ORKG comparison IDs, and query type.
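As a rough illustration of how one such bundle might be held in memory, the sketch below models the listed components as a Python dataclass. Every field name here is hypothetical and chosen only for this example; only the file name results_SPARQL.csv comes from the description above, and the actual on-disk layout should be taken from readme.md.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class QueryBundle:
    """One query bundle. All field names are hypothetical, not the dataset's schema."""
    query_id: str                      # hypothetical identifier
    domain: str                        # "ALD" or "ALE"
    question_brief: str                # brief natural-language question
    question_detailed: str             # detailed natural-language question
    sparql: str                        # gold-standard SPARQL query text
    comparison_csvs: List[str] = field(default_factory=list)   # ORKG comparison exports
    symbolic_results_csv: str = "results_SPARQL.csv"           # name given in the description
    model_results: Dict[str, str] = field(default_factory=dict)  # system name -> results file
    metadata: Dict[str, str] = field(default_factory=dict)       # source paper, DOI, IDs, query type


def count_by_domain(bundles: List[QueryBundle]) -> Dict[str, int]:
    """Count bundles per domain label."""
    counts: Dict[str, int] = {}
    for bundle in bundles:
        counts[bundle.domain] = counts.get(bundle.domain, 0) + 1
    return counts
```

For the full dataset, `count_by_domain` would be expected to report 19 ALD and 14 ALE bundles, matching the counts stated in the description.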

The dataset supports research in NL→SPARQL translation, scientific table QA, symbolic vs neural vs neurosymbolic evaluation, and reproducible meta-analysis of ALD/E processes.
It also includes domain-expert survey assessments of query clarity and result quality.
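One simple way to compare a system's answer against the symbolic results is a set-based precision, recall, and F1 over result rows, sketched below. This is an assumption made for illustration, not the evaluation protocol of the dataset authors; the function names, the case-insensitive row matching, and the presence of a header row in each CSV are all assumptions.

```python
import csv
import io
from typing import Set, Tuple


def rows_as_set(csv_text: str) -> Set[Tuple[str, ...]]:
    """Parse CSV text into a set of normalized rows, skipping the header row."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # assumption: the first row is a header
    return {tuple(cell.strip().lower() for cell in row) for row in reader if row}


def precision_recall_f1(gold_csv: str, predicted_csv: str) -> Tuple[float, float, float]:
    """Set-based precision, recall, and F1 of predicted rows against gold rows."""
    gold = rows_as_set(gold_csv)
    predicted = rows_as_set(predicted_csv)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

Exact row matching is strict: a model answer that differs only in units or rounding counts as a miss, so a real evaluation would likely need per-column normalization.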

The resource is intended for materials scientists seeking FAIR, queryable ALD/E knowledge, and for AI researchers developing models that connect natural-language questions with graph-structured scientific evidence.

Files

readme.md

Files (34.2 MB)

  • md5:056e249bb39874bb017fd756f3b838c3 (20.8 MB)
  • md5:76c7e38c44877560b761b2afc0707ccd (11.7 MB)
  • md5:9d4c2180981e97c57ad03a660ccbc1b7 (16.1 kB)
  • md5:1d2f8d22766e6fda440f08fa957604e1 (4.3 kB)
  • md5:986f3b44dcfd8b0eefc2f2c347fdba12 (559.4 kB)
  • md5:132046d78c64893b5902791705332758 (571.1 kB)
  • md5:e1b6637ab4028a383b0f5b205b368cfd (563.2 kB)
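The md5 values listed for the files can be used to check a download for corruption. The following is a minimal sketch using Python's standard hashlib module; the helper names and the path argument are hypothetical, and the listed value is passed in exactly as it appears on this record, including the `md5:` prefix.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path: str, listed: str) -> bool:
    """Compare a file's md5 with a listed value such as 'md5:<hex digest>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

A mismatch usually indicates an incomplete or altered download; md5 is adequate for that integrity check but is not a safeguard against deliberate tampering.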

Additional details

Software