Norwegian Medical Question Answering Dataset - NorMedQA

Riegler, Michael A.

doi:10.5281/zenodo.15320038

There is a newer version of the record available.

Published May 1, 2025 | Version v1

Dataset Open

Riegler, Michael A.¹

1. Simula Research Laboratory

Contributors

Contact person: Riegler, Michael A.¹

This benchmark dataset consists of 1241 medical question-and-answer pairs, primarily in Norwegian (Bokmål and Nynorsk), designed for evaluating large language models (LLMs). The content originates from publicly available sources containing medical exam questions and has been cleaned and preprocessed. The dataset is structured in JSON format; each record contains the source document name, the question number (where available), the question text, and the reference answer text. It is suitable for use within evaluation frameworks such as lm-evaluation-harness (GitHub with config and code example: ) to assess model capabilities in medical knowledge retrieval and reasoning specific to the Norwegian context.
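The record layout described above can be explored with a few lines of Python. This is a minimal sketch, not code shipped with the dataset: it assumes the top-level JSON value is an array of record objects, and because the record does not state the exact field names, it lists whatever keys are present instead of hard-coding them. The helper names load_records and field_counts are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def load_records(path):
    """Load the dataset file and return its records as a list of dicts."""
    with Path(path).open(encoding="utf-8") as fh:
        data = json.load(fh)
    # Assumption: the file holds a JSON array with one object per Q&A pair.
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of records")
    return data


def field_counts(records):
    """Count how often each field name occurs across all records."""
    counts = Counter()
    for record in records:
        counts.update(record.keys())
    return counts
```

Calling field_counts(load_records("norwegian_medical_qa.json")) on the downloaded file would show the actual field names and reveal optional fields, such as the question number, that are missing from some records.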

Files

Name	Size	Checksum
norwegian_medical_qa.json	798.8 kB	md5:11dbf5d18aed89d7bc64f22ed087dcd2
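The MD5 checksum listed for the file can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard hashlib module; the function names md5_of_file and matches_expected are hypothetical, and the local file path is an assumption about where the download was saved.

```python
import hashlib

# Checksum as listed in the record for norwegian_medical_qa.json.
EXPECTED_MD5 = "11dbf5d18aed89d7bc64f22ed087dcd2"


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Compare a file's MD5 digest with the expected value, ignoring case."""
    return md5_of_file(path) == expected.lower()
```

For example, matches_expected("norwegian_medical_qa.json") should return True for an unmodified copy of this version of the file. MD5 is adequate for detecting accidental corruption but is not a safeguard against deliberate tampering.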

	All versions	This version
Views	867	282
Downloads	262	51
Data volume	265.5 MB	42.3 MB

Resource type: Dataset

Publisher: Zenodo

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: May 1, 2025
Modified: May 1, 2025
