Published December 19, 2022 | Version BioASQ11
Dataset Open

BioASQ-QA: A manually curated corpus for Biomedical Question Answering

  • 1. National Center for Scientific Research (NCSR) "Demokritos"

Description

The BioASQ question answering (QA) benchmark dataset contains questions in English, along with gold-standard (reference) answers and related material. The dataset has been designed to reflect the real information needs of biomedical experts and is therefore more realistic and challenging than most existing datasets. Furthermore, unlike most previous QA benchmarks, which contain only exact answers, the BioASQ-QA dataset also includes ideal answers (in effect, summaries), which are particularly useful for research on multi-document summarization. The dataset combines structured and unstructured data. The material linked with each question comprises documents and snippets, which are useful for Information Retrieval and Passage Retrieval experiments, as well as concepts, which are useful in concept-to-text Natural Language Generation. Researchers working on paraphrasing and textual entailment can also measure the degree to which their methods improve the performance of biomedical QA systems. Last but not least, the dataset is continuously extended: the BioASQ challenge is ongoing, and new data are generated with each edition.
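As a rough illustration of how the material described above might be inspected, the following Python sketch counts questions per type and tallies how many questions carry each kind of linked material. The key names used here (`questions`, `type`, `documents`, `snippets`, `concepts`, `exact_answer`, `ideal_answer`) are assumptions that are not stated in this record and should be checked against the downloaded file. The sample record is an invented placeholder, not data from the corpus.

```python
import json
from collections import Counter

# Assumed field names; verify them against the actual JSON file.
MATERIAL_FIELDS = ("documents", "snippets", "concepts", "exact_answer", "ideal_answer")


def questions_from_json(text):
    """Parse JSON text and return the list under the assumed top-level 'questions' key."""
    return json.loads(text)["questions"]


def summarize(questions):
    """Return (questions per type, number of questions with each non-empty material field)."""
    by_type = Counter(q.get("type", "unknown") for q in questions)
    with_material = Counter()
    for q in questions:
        for field in MATERIAL_FIELDS:
            if q.get(field):  # counts only fields that are present and non-empty
                with_material[field] += 1
    return by_type, with_material


# Invented placeholder record, used only to exercise the functions.
sample = [
    {
        "id": "placeholder-id",
        "type": "factoid",
        "body": "Placeholder question text?",
        "documents": ["http://example.org/doc1"],
        "snippets": [{"text": "placeholder snippet"}],
        "exact_answer": [["placeholder"]],
        "ideal_answer": ["Placeholder summary answer."],
    }
]

by_type, with_material = summarize(sample)
print(dict(by_type))
print(dict(with_material))
```

With the real file, the same summary could be obtained by reading `training11b.json` as UTF-8 text and passing it to `questions_from_json`, provided the assumed keys match.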

Files

training11b.json

Files (37.6 MB)

Checksum Size
md5:0b006c8b2ed78926c7227cf96726dff1 1.3 kB
md5:fc1fe03831b69157c82a746337c00712 37.6 MB

Additional details

Funding

International Workshop on Large-scale Biomedical Semantic Indexing and Question Answering (BioASQ) 5R13LM012214-03
National Institutes of Health
International Workshop on Large-scale Biomedical Semantic Indexing and Question Answering (BioASQ) 5R13LM012214-02
National Institutes of Health
BIOASQ – A challenge on large-scale biomedical semantic indexing and question answering 318652
European Commission

References

  • Anastasia Krithara, Anastasios Nentidis, Konstantinos Bougiatiotis, Georgios Paliouras. BioASQ-QA: A manually curated corpus for Biomedical Question Answering. bioRxiv 2022.12.14.520213; doi: https://doi.org/10.1101/2022.12.14.520213