Published September 15, 2025 | Version v1.0.0
Dataset Open

SynthMedic Dataset

  • 1. Sofia University "St. Kliment Ohridski"
  • 2. Multiprofile Hospital for Active Treatment in Neurology and Psychiatry St. Naum
  • 3. Medical University of Sofia
  • 4. Graphwise

Description

SynthMedic: Utilizing large language models for synthetic discharge summary generation, correction and validation

Citation

If you use our dataset, please cite the following work (accepted for publication):

Grazhdanski, G., Vasilev, V., Vassileva, S., Taskov, D., Antova, I., Koychev, I., and Boytcheva, S. (2025). SynthMedic: Utilizing large language models for synthetic discharge summary generation, correction and validation. Accepted for publication in the Journal of Biomedical Informatics.

Files

ggrazh/synthetic-clinical-corpus-v1.0.0.zip
Size: 1.9 MB
md5:efa85b08871e585a40be0920536bdf7b
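
The md5 checksum listed for the archive can be used to confirm that a download is intact. The following is a minimal sketch, not part of the dataset itself: the helper names md5_of_file and verify_download are hypothetical, and the local file path is an assumption about where the archive was saved.

```python
import hashlib
from pathlib import Path

# Checksum as listed on this record for the v1.0.0 archive.
EXPECTED_MD5 = "efa85b08871e585a40be0920536bdf7b"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected md5 digest."""
    return md5_of_file(path) == expected.lower()
```

For example, assuming the archive was saved in the current directory under its original name, verify_download("synthetic-clinical-corpus-v1.0.0.zip") should return True for an uncorrupted copy.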

Additional details

Related works

Is described by
Journal: 10.1016/j.jbi.2025.104906 (DOI)
Is supplement to
Dataset: https://github.com/ggrazh/synthetic-clinical-corpus/tree/v1.0.0 (URL)

Funding

European Union
European Union (EU)-Next-GenerationEU, through the National Recovery and Resilience Plan of the Republic of Bulgaria BG-RRP-2.004-0008
European Union
Horizon Europe Research and Innovation Program project RES-Q plus 101057603

Dates

Available
2025-09-15