RheumaMIR
Creators
- 1. Grupo de Patología Musculoesquelética. Hospital Clínico San Carlos. Instituto de Investigación Sanitaria San Carlos (IdISSC), Prof. Martin Lagos s/n, Madrid, 28040, Spain
- 2. Reumatología. Hospital Universitario la Paz-IdiPaz, Paseo de la Castellana, 261, Madrid, 28046, Spain
- 3. Medicina Interna. Hospital Universitario del Henares, Avenida de Marie Curie, 0, Madrid, 28822, Spain
Description
This dataset accompanies the research paper entitled:
Harnessing ChatGPT and GPT-4 for Evaluating the Rheumatology Questions of the Spanish Access Exam to Specialized Medical Training.
Alfredo Madrid-García, Zulema Rosales-Rosado, Dalifer Dayanira Freites-Núñez, Inés Pérez-Sancristobal, Esperanza Pato-Cour, Chamaida Plasencia-Rodríguez, Luis Cabeza-Osorio, Lydia Abasolo Alcazar, Leticia Leon Mateos, Benjamín Fernández-Gutiérrez, Luis Rodríguez-Rodríguez
medRxiv 2023.07.21.23292821; doi: https://doi.org/10.1101/2023.07.21.23292821
The dataset contains 145 rheumatology-related questions extracted from the Spanish MIR exams held between the academic years 2009-2010 and 2022-2023. Each question was answered by ChatGPT, GPT-4, BARD and CLAUDE. Six rheumatologists assessed the clinical reasoning of ChatGPT and GPT-4.
The dataset is made up of the following columns:
Column | Description |
---|---|
Id | Question identifier |
Question (ES) | MIR exam question in Spanish |
Question (EN) | Translation of `Question (ES)` column |
Year | Academic year of the question (from 2009-2010 to 2022-2023) |
Question Type | Case or factual question |
Genre | Sex of the patient in the question: Male, Female, Does not apply, No sex (newborn) |
Invalidated question | 0, 1 (Whether the question was invalidated by the Spanish Ministry of Health) |
Official answer | Official answer given by the Spanish Ministry of Health |
GPT-4 answer | Answer provided by GPT-4 |
Correct answer GPT-4 | 0, 1 (Whether the answer provided by GPT-4 is correct) |
Clinical reasoning GPT-4 (ES) | Clinical reasoning provided by GPT-4 |
Clinical reasoning GPT-4 (EN) | Translation of `Clinical reasoning GPT-4 (ES)` column |
Eval1_GPT4 | The score of the `Clinical reasoning GPT-4 (ES)` column given by the first evaluator |
Eval2_GPT4 | The score of the `Clinical reasoning GPT-4 (ES)` column given by the second evaluator |
Eval3_GPT4 | The score of the `Clinical reasoning GPT-4 (ES)` column given by the third evaluator |
Eval4_GPT4 | The score of the `Clinical reasoning GPT-4 (ES)` column given by the fourth evaluator |
Eval5_GPT4 | The score of the `Clinical reasoning GPT-4 (ES)` column given by the fifth evaluator |
Eval6_GPT4 | The score of the `Clinical reasoning GPT-4 (ES)` column given by the sixth evaluator |
ChatGPT answer | Answer provided by ChatGPT |
Correct answer ChatGPT | 0, 1 (Whether the answer provided by ChatGPT is correct) |
Clinical reasoning ChatGPT (ES) | Clinical reasoning provided by ChatGPT |
Clinical reasoning ChatGPT (EN) | Translation of `Clinical reasoning ChatGPT (ES)` column |
Eval1_ChatGPT | The score of the `Clinical reasoning ChatGPT (ES)` column given by the first evaluator |
Eval2_ChatGPT | The score of the `Clinical reasoning ChatGPT (ES)` column given by the second evaluator |
Eval3_ChatGPT | The score of the `Clinical reasoning ChatGPT (ES)` column given by the third evaluator |
Eval4_ChatGPT | The score of the `Clinical reasoning ChatGPT (ES)` column given by the fourth evaluator |
Eval5_ChatGPT | The score of the `Clinical reasoning ChatGPT (ES)` column given by the fifth evaluator |
Eval6_ChatGPT | The score of the `Clinical reasoning ChatGPT (ES)` column given by the sixth evaluator |
Disease category (ES) | The disease category addressed by the question (Bone metabolism, Infective arthritis, Microcrystalline arthritis, Others, Rheumatoid arthritis, Scleroderma, Spondyloarthropathies, Systemic lupus erythematosus, Vasculitis) |
Disease category (EN) | Translation of `Disease category (ES)` column |
CLAUDE answer | Answer provided by CLAUDE |
Correct answer CLAUDE | 0, 1 (Whether the answer provided by CLAUDE is correct) |
Clinical reasoning CLAUDE (ES) | Clinical reasoning provided by CLAUDE |
Clinical reasoning CLAUDE (EN) | Translation of `Clinical reasoning CLAUDE (ES)` column |
BARD answer | Answer provided by BARD |
Correct answer BARD | 0, 1 (Whether the answer provided by BARD is correct) |
Clinical reasoning BARD (ES) | Clinical reasoning provided by BARD |
Clinical reasoning BARD (EN) | Translation of `Clinical reasoning BARD (ES)` column |
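The `Correct answer ...` columns in the table above lend themselves to a per-model accuracy summary. The following is a minimal sketch, not the authors' analysis code. It assumes the dataset has been exported to a CSV file whose header matches the column names listed above; the file path passed to `load_rows`, the helper names, and the choice to exclude invalidated questions are all illustrative assumptions.

```python
import csv

# Model names as they appear in the "Correct answer <model>" column headers.
MODELS = ["GPT-4", "ChatGPT", "CLAUDE", "BARD"]


def load_rows(path):
    """Read the dataset from a CSV export (hypothetical path and file format)."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def as_flag(value):
    """Convert a 0/1 cell (possibly stored as '1' or '1.0') to an int."""
    return int(float(str(value).strip()))


def accuracy_by_model(rows, models=MODELS):
    """Fraction of correct answers per model.

    Assumption: questions flagged in 'Invalidated question' are excluded.
    """
    valid = [r for r in rows if as_flag(r["Invalidated question"]) != 1]
    result = {}
    for model in models:
        flags = [as_flag(r[f"Correct answer {model}"]) for r in valid]
        result[model] = sum(flags) / len(flags) if flags else float("nan")
    return result
```

With a real export, `accuracy_by_model(load_rows("rheumamir.csv"))` would return a dictionary keyed by model name; the file name here is a placeholder, since the record does not state it.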
The questions and the clinical reasoning were translated from Spanish into English with DeepL.
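Returning to the evaluator columns in the table, the six `EvalN_GPT4` and `EvalN_ChatGPT` scores can be combined into one mean score per question. The sketch below is one possible way to do so, under assumptions: each row is a dictionary keyed by the column names above (as `csv.DictReader` would produce), empty cells mean a missing score, and the function name `mean_reasoning_score` is hypothetical. The scoring scale is not stated in this record, so the code treats scores as plain numbers.

```python
from statistics import mean


def mean_reasoning_score(row, model_suffix, n_evaluators=6):
    """Average the evaluator scores for one question.

    model_suffix is 'GPT4' or 'ChatGPT', matching the EvalN_<suffix> columns.
    Returns None when no evaluator score is present (assumed: blank = missing).
    """
    scores = []
    for i in range(1, n_evaluators + 1):
        raw = str(row.get(f"Eval{i}_{model_suffix}", "")).strip()
        if raw:
            scores.append(float(raw))
    return mean(scores) if scores else None
```

Averaging is only one summary; agreement between evaluators may call for a different statistic.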
Files
Size | Checksum |
---|---|
587.7 kB | md5:43358e66eeb8be177adef823ed361d7c |
Additional details
References
- Alfredo Madrid-García, Zulema Rosales-Rosado, Dalifer Dayanira Freites-Núñez, Inés Pérez-Sancristobal, Esperanza Pato-Cour, Chamaida Plasencia-Rodríguez, Luis Cabeza-Osorio, Lydia Abasolo Alcazar, Leticia Leon Mateos, Benjamín Fernández-Gutiérrez, Luis Rodríguez-Rodríguez. Harnessing ChatGPT and GPT-4 for Evaluating the Rheumatology Questions of the Spanish Access Exam to Specialized Medical Training. medRxiv 2023.07.21.23292821; doi: https://doi.org/10.1101/2023.07.21.23292821