Diabetic Macular Edema VQA Dataset
Description
Medical VQA dataset built from the IDRiD and eOphta datasets. The dataset contains both healthy and unhealthy fundus images. For each image, a set of pre-defined questions is generated, including questions about regions (e.g. are there hard exudates in this region?), for which an associated mask denotes the location of the region.
This dataset is motivated in part by the lack of public medical VQA datasets with related questions. In our dataset, questions are related because each image has a high-level question about its DME grade, together with associated low-level questions whose answers can lead to the answer to the high-level question. This makes it possible to study the consistency of a VQA model, i.e., how often the model produces contradictory answers to questions about a given image. Questions about regions are also a novel feature of this dataset.
The dataset can be used for general VQA purposes, and also for the more specific purpose of consistency improvement.
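To make the notion of consistency concrete, here is a minimal sketch of one possible way to score it: among images whose high-level (DME grade) question is answered correctly, count the fraction of associated low-level questions that are also answered correctly. This is an illustrative assumption, not necessarily the metric used in our paper, and the record keys (`image_id`, `role`, `answer`, `prediction`) are hypothetical and do not reflect the actual file layout of the dataset.

```python
from collections import defaultdict


def consistency(records):
    """Fraction of low-level answers that are correct, restricted to images
    whose high-level answer is correct. Returns None if there are no such pairs.

    Each record is a dict with hypothetical keys:
      image_id, role ("main" or "sub"), answer, prediction
    """
    main_correct = {}           # image_id -> was the high-level question right?
    subs = defaultdict(list)    # image_id -> correctness of each low-level question

    for r in records:
        ok = r["prediction"] == r["answer"]
        if r["role"] == "main":
            main_correct[r["image_id"]] = ok
        else:
            subs[r["image_id"]].append(ok)

    total = 0
    consistent = 0
    for image_id, ok_main in main_correct.items():
        if not ok_main:
            continue  # only images with a correct high-level answer are scored
        for ok_sub in subs.get(image_id, []):
            total += 1
            consistent += int(ok_sub)

    return consistent / total if total else None


# Toy predictions (made up for illustration only).
example = [
    {"image_id": "img1", "role": "main", "answer": "2", "prediction": "2"},
    {"image_id": "img1", "role": "sub", "answer": "yes", "prediction": "yes"},
    {"image_id": "img1", "role": "sub", "answer": "no", "prediction": "yes"},
    {"image_id": "img2", "role": "main", "answer": "0", "prediction": "1"},
    {"image_id": "img2", "role": "sub", "answer": "no", "prediction": "no"},
]
score = consistency(example)
```

In the toy example, only `img1` has a correct high-level answer, and one of its two low-level questions is answered correctly, so the score is 0.5.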
Number of images: Train: 433, Val: 112, Test: 134
Number of QA pairs: Train: 9779, Val: 2380, Test: 1311
More details can be found here.
If you use this dataset, please make sure you cite our paper:
@inproceedings{tascon2022consistency,
title={Consistency-Preserving Visual Question Answering in Medical Imaging},
author={Tascon-Morales, Sergio and Márquez-Neila, Pablo and Sznitman, Raphael},
booktitle={International Conference on Medical Image Computing and Computer-Assisted Intervention},
pages={386--395},
year={2022},
organization={Springer}
}
Do you need annotations about logical relations? No problem; check out our DME VQA dataset with logical relations.
Files
Name | Size | MD5 |
---|---|---|
dme_vqa.zip | 36.7 MB | bf1f328b9ef6eb699bfad4b5072284a4 |
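The MD5 checksum listed for dme_vqa.zip can be used to check that a download is intact. The following is a minimal sketch using only the Python standard library; the local path passed to `verify_download` is an assumption about where you saved the archive, and the function names are illustrative.

```python
import hashlib

# Checksum as listed on this page for dme_vqa.zip.
EXPECTED_MD5 = "bf1f328b9ef6eb699bfad4b5072284a4"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected MD5 checksum."""
    return md5_of_file(path) == expected
```

For example, `verify_download("dme_vqa.zip")` should return `True` for an uncorrupted copy saved in the current directory.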
Additional details
Related works
- Is supplement to
- Conference paper: 10.1007/978-3-031-16452-1_37 (DOI)