There is a newer version of the record available.

Published May 1, 2025 | Version v1
Dataset Open

Norwegian Medical Question Answering Dataset - NorMedQA

Authors/Creators

  • 1. Simula Research Laboratory

Contributors

Contact person:

  • 1. Simula Research Laboratory

Description

This benchmark dataset consists of 1241 medical question-and-answer pairs, primarily in Norwegian (Bokmål and Nynorsk), designed for evaluating large language models (LLMs). The content originates from publicly available sources of medical exam questions and has been cleaned and preprocessed. The dataset is stored in JSON format; each record contains the source document name, the question number (where available), the question text, and the reference answer text. It is suitable for use within evaluation frameworks such as lm-evaluation-harness (GitHub with config and code example: ) to assess model capabilities in medical knowledge retrieval and reasoning specific to the Norwegian context.
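The record layout described above can be sketched in Python. This is only an illustration under stated assumptions: the description names four fields but not the actual JSON keys, so the key names `source`, `question_number`, `question`, and `answer` are hypothetical, as is the assumption that the file's top level is a list of records. The sample data is a placeholder, not content from the dataset.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QAItem:
    source: str                     # source document name
    question_number: Optional[str]  # present only where available
    question: str                   # question text
    answer: str                     # reference answer text


def parse_records(raw: str) -> List[QAItem]:
    """Parse a JSON array of records into QAItem objects.

    The key names used here are hypothetical; adjust them to match
    the keys actually present in norwegian_medical_qa.json.
    """
    items = []
    for rec in json.loads(raw):
        number = rec.get("question_number")
        items.append(
            QAItem(
                source=rec["source"],
                question_number=None if number is None else str(number),
                question=rec["question"].strip(),
                answer=rec["answer"].strip(),
            )
        )
    return items


# Placeholder data for illustration only (not taken from the dataset).
sample = json.dumps(
    [
        {
            "source": "exam_a.pdf",
            "question_number": 3,
            "question": " placeholder question text ",
            "answer": "placeholder answer text",
        },
        {
            "source": "exam_b.pdf",
            "question": "another placeholder question",
            "answer": "another placeholder answer",
        },
    ]
)

items = parse_records(sample)
```

To apply this to the real file, read its contents with `open("norwegian_medical_qa.json", encoding="utf-8").read()` and pass the string to `parse_records`, after checking the keys of the first record against the hypothetical names above.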

Files

norwegian_medical_qa.json

Files (798.8 kB)

Name: norwegian_medical_qa.json
Size: 798.8 kB
md5:11dbf5d18aed89d7bc64f22ed087dcd2