RheumaMIR

doi:10.5281/zenodo.8190084

Published July 27, 2023 | Version 2.0.0

Dataset Open

1. Grupo de Patología Musculoesquelética. Hospital Clínico San Carlos. Instituto de Investigación Sanitaria San Carlos (IdISSC), Prof. Martin Lagos s/n, Madrid, 28040, Spain
2. Reumatología. Hospital Universitario la Paz-IdiPaz, Paseo de la Castellana, 261, Madrid, 28046, Spain
3. Medicina Interna. Hospital Universitario del Henares, Avenida de Marie Curie, 0, Madrid, 28822, Spain

Project leader:

Alfredo Madrid García¹

This dataset accompanies the research paper entitled:

Harnessing ChatGPT and GPT-4 for Evaluating the Rheumatology Questions of the Spanish Access Exam to Specialized Medical Training.

Alfredo Madrid-García, Zulema Rosales-Rosado, Dalifer Dayanira Freites-Núñez, Inés Pérez-Sancristobal, Esperanza Pato-Cour, Chamaida Plasencia-Rodríguez, Luis Cabeza-Osorio, Lydia Abasolo Alcazar, Leticia Leon Mateos, Benjamín Fernández-Gutiérrez, Luis Rodríguez-Rodríguez

medRxiv 2023.07.21.23292821; doi: https://doi.org/10.1101/2023.07.21.23292821

The dataset contains 145 rheumatology-related questions extracted from the Spanish MIR exams held between the academic years 2009-2010 and 2022-2023. The questions were answered by ChatGPT, GPT-4, BARD and CLAUDE, and six rheumatologists assessed the clinical reasoning of ChatGPT and GPT-4.

The dataset is made up of the following columns:

Column	Description
Id	Question identifier
Question (ES)	MIR exam question in Spanish
Question (EN)	Translation of `Question (ES)` column
Year	Academic year of the question (from 2009-2010 to 2022-2023)
Question Type	Case or factual question
Genre	Male, Female, Does not apply, No sex (newborn)
Invalidated question	0, 1 (Whether the question was invalidated by the Spanish Ministry of Health)
Official answer	Official answer given by the Spanish Ministry of Health
GPT-4 answer	Answer provided by GPT-4
Correct answer GPT-4	0, 1 (Whether the answer provided by GPT-4 is correct)
Clinical reasoning GPT-4 (ES)	Clinical reasoning provided by GPT-4
Clinical reasoning GPT-4 (EN)	Translation of `Clinical reasoning GPT-4 (ES)` column
Eval1_GPT4	The score of the `Clinical reasoning GPT-4 (ES)` column given by the first evaluator
Eval2_GPT4	The score of the `Clinical reasoning GPT-4 (ES)` column given by the second evaluator
Eval3_GPT4	The score of the `Clinical reasoning GPT-4 (ES)` column given by the third evaluator
Eval4_GPT4	The score of the `Clinical reasoning GPT-4 (ES)` column given by the fourth evaluator
Eval5_GPT4	The score of the `Clinical reasoning GPT-4 (ES)` column given by the fifth evaluator
Eval6_GPT4	The score of the `Clinical reasoning GPT-4 (ES)` column given by the sixth evaluator
ChatGPT answer	Answer provided by ChatGPT
Correct answer ChatGPT	0, 1 (Whether the answer provided by ChatGPT is correct)
Clinical reasoning ChatGPT (ES)	Clinical reasoning provided by ChatGPT
Clinical reasoning ChatGPT (EN)	Translation of `Clinical reasoning ChatGPT (ES)` column
Eval1_ChatGPT	The score of the `Clinical reasoning ChatGPT (ES)` column given by the first evaluator
Eval2_ChatGPT	The score of the `Clinical reasoning ChatGPT (ES)` column given by the second evaluator
Eval3_ChatGPT	The score of the `Clinical reasoning ChatGPT (ES)` column given by the third evaluator
Eval4_ChatGPT	The score of the `Clinical reasoning ChatGPT (ES)` column given by the fourth evaluator
Eval5_ChatGPT	The score of the `Clinical reasoning ChatGPT (ES)` column given by the fifth evaluator
Eval6_ChatGPT	The score of the `Clinical reasoning ChatGPT (ES)` column given by the sixth evaluator
Disease category (ES)	The disease category addressed by the question (Bone metabolism, Infective arthritis, Microcrystalline arthritis, Others, Rheumatoid arthritis, Scleroderma, Spondyloarthropathies, Systemic lupus erythematosus, Vasculitis)
Disease category (EN)	Translation of `Disease category (ES)` column
CLAUDE answer	Answer provided by CLAUDE
Correct answer CLAUDE	0, 1 (Whether the answer provided by CLAUDE is correct)
Clinical reasoning CLAUDE (ES)	Clinical reasoning provided by CLAUDE
Clinical reasoning CLAUDE (EN)	Translation of `Clinical reasoning CLAUDE (ES)` column
BARD answer	Answer provided by BARD
Correct answer BARD	0, 1 (Whether the answer provided by BARD is correct)
Clinical reasoning BARD (ES)	Clinical reasoning provided by BARD
Clinical reasoning BARD (EN)	Translation of `Clinical reasoning BARD (ES)` column
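
As an illustration of how the `Correct answer ...` columns above might be used, the following minimal sketch computes the fraction of correct answers per model. It assumes the spreadsheet has been exported to CSV with the column names listed in the table; the sample rows are made up for the example and are not values from the dataset, and skipping invalidated questions is an assumption of this sketch rather than a rule stated by the authors.

```python
import csv
import io

# Model names as they appear in the "Correct answer <model>" column headers.
MODELS = ["GPT-4", "ChatGPT", "CLAUDE", "BARD"]


def accuracy_by_model(rows, models=MODELS):
    """Return {model: fraction of correct answers}.

    Questions flagged as invalidated are skipped (an assumption of this sketch).
    """
    valid = [r for r in rows if r["Invalidated question"].strip() != "1"]
    result = {}
    for model in models:
        flags = [int(r[f"Correct answer {model}"]) for r in valid]
        result[model] = sum(flags) / len(flags) if flags else float("nan")
    return result


# Made-up rows for illustration only; not taken from RheumaMIR.xlsx.
SAMPLE_CSV = """Id,Invalidated question,Correct answer GPT-4,Correct answer ChatGPT,Correct answer CLAUDE,Correct answer BARD
1,0,1,1,1,0
2,0,1,0,1,1
3,1,0,0,0,0
4,0,1,1,0,0
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
scores = accuracy_by_model(rows)
for model, acc in scores.items():
    print(f"{model}: {acc:.2f}")
```

With the real file, `rows` would instead come from whatever reader is used to load the spreadsheet; only the column names are taken from the table above.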

The questions and the clinical reasoning were translated from Spanish into English with DeepL.
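
The six evaluator columns described in the table (`Eval1_GPT4` to `Eval6_GPT4`, `Eval1_ChatGPT` to `Eval6_ChatGPT`) can be summarised per question. The sketch below averages them for one row, assuming each row is available as a dictionary keyed by column name and that blank cells mean a missing score. The helper name `mean_reasoning_score` and the example scores are hypothetical; the scoring scale is not described here, so the values are placeholders.

```python
from statistics import mean


def mean_reasoning_score(row, model_suffix, n_evaluators=6):
    """Average the Eval<k>_<model_suffix> columns of one row, ignoring blank cells.

    Returns None when no evaluator score is present.
    """
    values = []
    for k in range(1, n_evaluators + 1):
        cell = str(row.get(f"Eval{k}_{model_suffix}", "")).strip()
        if cell:
            values.append(float(cell))
    return mean(values) if values else None


# Hypothetical row; the scores are placeholders, not dataset values.
example_row = {
    "Id": "1",
    "Eval1_GPT4": "5", "Eval2_GPT4": "4", "Eval3_GPT4": "5",
    "Eval4_GPT4": "4", "Eval5_GPT4": "", "Eval6_GPT4": "5",
    "Eval1_ChatGPT": "3", "Eval2_ChatGPT": "4", "Eval3_ChatGPT": "3",
    "Eval4_ChatGPT": "3", "Eval5_ChatGPT": "4", "Eval6_ChatGPT": "4",
}

gpt4_mean = mean_reasoning_score(example_row, "GPT4")
chatgpt_mean = mean_reasoning_score(example_row, "ChatGPT")
print(gpt4_mean, chatgpt_mean)
```

Note that the evaluator columns use the suffix `GPT4` without a hyphen, whereas the answer columns use `GPT-4`.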

Files (587.7 kB)

Name	Size
RheumaMIR.xlsx (md5:43358e66eeb8be177adef823ed361d7c)	587.7 kB
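
The MD5 checksum listed for the file can be used to check a downloaded copy. The following is a minimal sketch using Python's standard `hashlib` module; the function names and the local file path are assumptions for the example, while the expected digest is the one shown in the file listing.

```python
import hashlib

# Digest published alongside RheumaMIR.xlsx in the file listing.
EXPECTED_MD5 = "43358e66eeb8be177adef823ed361d7c"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` has the expected MD5 digest."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_published_checksum("RheumaMIR.xlsx")` would return `True` for an intact copy saved under that (assumed) local name.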
