Published December 11, 2024 | Version v1.0
Dataset | Open Access

SciRAG-QA: Multi-domain Closed-Question Benchmark Dataset for Scientific QA

  • Åbo Akademi University

Description

One of the most impactful applications of the growing capabilities of Large Language Models (LLMs) has been their use in Retrieval-Augmented Generation (RAG) systems. RAG applications are more robust against LLM hallucinations and provide source traceability, both of which are critical in the scientific reading and writing process. Validating such systems is nevertheless essential, given the stringent requirements for rigor in the scientific domain. Existing benchmark datasets cover a limited range of research areas, often focusing on the natural sciences, which restricts their usefulness for validating RAG systems in other scientific fields.

To address this gap, we present a closed-question answering (QA) dataset for benchmarking scientific RAG applications. The dataset spans 34 research topics across 10 distinct areas of study and includes 108 manually curated question-answer pairs, each annotated with an answer type, a difficulty level, and a gold reference, along with a link to the source paper. Further details on each of these attributes can be found in the accompanying README.md file.
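
For illustration, the question-answer pairs can be loaded and inspected with pandas. This is only a sketch: the column names used below (answer_type, difficulty) are assumptions inferred from the attribute descriptions above, and the README.md should be consulted for the actual schema.

    # A minimal sketch of loading and inspecting the dataset with pandas.
    # The column names below are assumptions inferred from the attribute
    # descriptions above; see README.md for the authoritative schema.
    import pandas as pd

    df = pd.read_csv("dataset.csv")

    # List the annotation attributes actually present in the file.
    print(df.columns.tolist())

    # Distribution of the assumed "answer_type" and "difficulty" annotations.
    print(df["answer_type"].value_counts())
    print(df["difficulty"].value_counts())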

Please cite the following publication when using the dataset: TBD

The publication is available at: TBD

A preprint version of the publication is available at: TBD

Files (138.1 kB)

dataset.csv

md5:95f1a1782b429a1c72ffdd97ab70afbe (44.4 kB)
md5:43f67d5a65c33d1732b8eb0fb9e0573b (61.4 kB)
md5:30dc3daffb19eea9f9b9c5c523ba7aad (29.6 kB)
md5:85f1abc3f406aeac4669db338c42f979 (2.8 kB)
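
After downloading, a file's integrity can be verified against the MD5 checksums listed above. The sketch below assumes the file sits in the current working directory; compare the printed digest with the corresponding checksum from the list.

    # Minimal sketch: compute a downloaded file's MD5 digest and compare
    # it manually with the checksum published above for that file.
    import hashlib

    def md5sum(path: str) -> str:
        h = hashlib.md5()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(8192), b""):
                h.update(chunk)
        return h.hexdigest()

    print(md5sum("dataset.csv"))  # should match the listed md5 for this file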

Additional details

Dates

Created
2024-12