SciR
Authors/Creators
Description
SciR is a multi-document scientific-reasoning benchmark with verifiable ground truth across deduction, induction, and causal abduction, with parametric control over inference complexity and premise obfuscation. Each task is generated from a structured formal object (a deduction tree, an inductive rule hypothesis, or a causal graph) and rendered into multi-document scientific discourse via a domain-tuned, cross-validated rendering scheme. The benchmark is anchored on three canonical biology setups: developmental-biology pathway syllogisms (deduction), DrugBank drug-interaction patterns (induction), and the Sachs protein-signalling network (causal). Two axes can be dialled independently: how hard the underlying inference is, and how hard it is to extract the relevant information from heterogeneous scientific text. The release contains the task files used in the SciReason paper, covering main-tier tasks (n=200 per tier, both NL and obfuscated modes) and difficulty-scaling tasks (n=50, NL-only) across the three reasoning tracks.
Files (10.3 MB)
| Name | Size | Download all |
|---|---|---|
| md5:192c0bef7d54b8d3e8b528c44d6edce7 | 10.3 MB | Download |
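The MD5 checksum listed above can be used to confirm that a downloaded copy of the release is intact. The following is a minimal sketch using only the Python standard library. The archive name `scir_release.zip` is a hypothetical placeholder, since the record does not show the file name, and the helper names `md5_of_file` and `verify_download` are illustrative rather than part of any SciR tooling.

```python
import hashlib
from pathlib import Path

# Checksum as listed in the file table of this record.
EXPECTED_MD5 = "192c0bef7d54b8d3e8b528c44d6edce7"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Hypothetical file name: replace with the actual downloaded archive.
    archive = Path("scir_release.zip")
    if archive.exists():
        print("checksum OK" if verify_download(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found")
```

Reading in chunks keeps memory use flat regardless of archive size. MD5 is adequate here for detecting a corrupted or truncated download, though it is not a safeguard against deliberate tampering.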