BioASQ-QA: A manually curated corpus for Biomedical Question Answering
- National Center for Scientific Research (NCSR) "Demokritos"
Description
The BioASQ question answering (QA) benchmark dataset contains questions in English, along with gold-standard (reference) answers and related material. The dataset is designed to reflect the real information needs of biomedical experts and is therefore more realistic and challenging than most existing datasets. Furthermore, unlike most previous QA benchmarks, which contain only exact answers, the BioASQ-QA dataset also includes ideal answers (in effect, summaries), which are particularly useful for research on multi-document summarization. The dataset combines structured and unstructured data. The material linked with each question comprises documents and snippets, which are useful for Information Retrieval and Passage Retrieval experiments, as well as concepts, which are useful in concept-to-text Natural Language Generation. Researchers working on paraphrasing and textual entailment can also measure the degree to which their methods improve the performance of biomedical QA systems. Finally, the dataset is continuously extended: the BioASQ challenge is ongoing, and new data are generated with each edition.
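As a rough illustration of how the material described above might be inspected, the following Python sketch counts questions by type and tallies the linked documents, snippets, and answers. The key names (`questions`, `type`, `documents`, `snippets`, `ideal_answer`, `exact_answer`) are assumptions about the JSON layout and are not stated in this record, so they should be checked against the downloaded file. The sample records are invented placeholders, not real BioASQ data, and `load_dataset` and `summarize` are hypothetical helper names.

```python
import json
from collections import Counter


def load_dataset(path):
    """Read a BioASQ-style JSON file into a Python dict."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def summarize(dataset):
    """Count questions by type and tally their linked material.

    Assumes a top-level "questions" list; missing keys are treated as empty.
    """
    questions = dataset.get("questions", [])
    by_type = Counter(q.get("type", "unknown") for q in questions)
    return {
        "n_questions": len(questions),
        "by_type": dict(by_type),
        "n_documents": sum(len(q.get("documents", [])) for q in questions),
        "n_snippets": sum(len(q.get("snippets", [])) for q in questions),
        "n_with_ideal_answer": sum(1 for q in questions if q.get("ideal_answer")),
        "n_with_exact_answer": sum(1 for q in questions if q.get("exact_answer")),
    }


# Invented placeholder records that mimic the assumed layout.
sample = {
    "questions": [
        {
            "body": "Is X associated with Y?",
            "type": "yesno",
            "documents": ["http://example.org/doc1", "http://example.org/doc2"],
            "snippets": [{"text": "placeholder snippet"}],
            "ideal_answer": ["Placeholder summary answer."],
            "exact_answer": "yes",
        },
        {
            "body": "What is known about Z?",
            "type": "summary",
            "documents": ["http://example.org/doc3"],
            "snippets": [],
            "ideal_answer": ["Another placeholder summary."],
        },
    ]
}

summary = summarize(sample)
print(summary)
```

To run the same summary on the real file, `summarize(load_dataset("training11b.json"))` should work if the assumed keys match the actual layout.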
Files
Name | Size | MD5 |
---|---|---|
training11b.json | 37.6 MB | fc1fe03831b69157c82a746337c00712 |
Additional details
Funding
- National Institutes of Health: International Workshop on Large-scale Biomedical Semantic Indexing and Question Answering (BioASQ), grant 5R13LM012214-03
- National Institutes of Health: International Workshop on Large-scale Biomedical Semantic Indexing and Question Answering (BioASQ), grant 5R13LM012214-02
- European Commission: BIOASQ – A challenge on large-scale biomedical semantic indexing and question answering, grant 318652
References
- Anastasia Krithara, Anastasios Nentidis, Konstantinos Bougiatiotis, Georgios Paliouras. BioASQ-QA: A manually curated corpus for Biomedical Question Answering. bioRxiv 2022.12.14.520213; doi: https://doi.org/10.1101/2022.12.14.520213