Adverse Drug Reaction (ADR) Text Dataset

Monko, Gloriana

doi:10.5281/zenodo.15191367

Published April 11, 2025 | Version v2

Dataset Open

Monko, Gloriana (Researcher)^{1, 2}

1. University of Dodoma
2. Shibaura Institute of Technology

This repository contains text data and code for identifying and clustering Adverse Drug Reactions (ADRs) using Sentence-BERT (S-BERT) embeddings and the SS-DBSCAN clustering algorithm. The dataset includes both labeled and unlabeled patient reports extracted from the publicly available MIMIC-III database.

The labeled data has been manually annotated to distinguish between ADR and non-ADR cases. The unlabeled data is used for unsupervised clustering experiments, particularly to assess clustering performance on high-dimensional data.
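
As a rough illustration of the embed, reduce, and cluster workflow described above, the following minimal sketch reduces a matrix of embeddings with PCA and clusters the result. It makes several assumptions: it uses scikit-learn's standard `DBSCAN` as a stand-in, not the SS-DBSCAN algorithm used in this work; the embeddings are synthetic random vectors rather than real S-BERT output; and the function name `cluster_embeddings` and all parameter values are hypothetical.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.decomposition import PCA


def cluster_embeddings(embeddings, n_components=10, eps=0.5, min_samples=5):
    """Reduce embeddings with PCA, then cluster them with plain DBSCAN.

    NOTE: plain DBSCAN is a stand-in here; it is not SS-DBSCAN.
    Returns one integer label per row; -1 marks points treated as noise.
    """
    reduced = PCA(n_components=n_components, random_state=0).fit_transform(embeddings)
    return DBSCAN(eps=eps, min_samples=min_samples).fit_predict(reduced)


# Synthetic stand-in for sentence embeddings (assumption: 384 dimensions,
# two well-separated groups). Real embeddings would come from an S-BERT model.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=0.05, size=(100, 384))
group_b = rng.normal(loc=1.0, scale=0.05, size=(100, 384))
fake_embeddings = np.vstack([group_a, group_b])

labels = cluster_embeddings(fake_embeddings, n_components=10, eps=1.0, min_samples=5)
n_clusters = len(set(labels) - {-1})  # number of clusters, ignoring noise
print(f"clusters found: {n_clusters}")
```

On real report text, `eps` and `min_samples` would need tuning, since distances between sentence embeddings are far less cleanly separated than in this toy data.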

New in This Version:
- Added Jupyter Notebook: `mimic-5k_PCA_tSNE_clustering.ipynb`
- Included detailed `README_ADR_Clustering_Task.txt` with step-by-step instructions to reproduce clustering results
- Explained how to scale experiments from 1,000 to full dataset size
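
The README's own scaling procedure is not reproduced here. As a hedged sketch of one common approach, the helpers below draw a fixed-seed random subsample and build a schedule of growing sample sizes that ends at the full dataset. The names `subsample` and `scaling_sizes`, the doubling factor, and the seed are all assumptions made for illustration.

```python
import random


def subsample(rows, n, seed=42):
    """Return n rows drawn without replacement using a fixed seed.

    If n is None or not smaller than len(rows), a copy of all rows is returned.
    """
    if n is None or n >= len(rows):
        return list(rows)
    return random.Random(seed).sample(rows, n)


def scaling_sizes(total, start=1000, factor=2):
    """Return sizes start, start*factor, ... below total, followed by total."""
    sizes = []
    size = start
    while size < total:
        sizes.append(size)
        size *= factor
    sizes.append(total)
    return sizes


# Hypothetical example: stand-in records instead of the real CSV rows
records = list(range(5000))
for size in scaling_sizes(len(records)):
    subset = subsample(records, size)
    print(size, len(subset))
```

Fixing the seed keeps each subset reproducible across runs, which makes clustering results at different sizes easier to compare.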

Files

Files (23.1 MB)

Name	MD5	Size
adr_filtered.csv	c80ef52b48bd9879aa173252cabed7cb	20.2 MB
mimic-5k_PCA_tSNE_clustering.ipynb	d0aede0ba8c0a9b0639bc24306600a33	3.0 MB
README_ADR_Clustering_Task.txt	b78c5cac4bc48e5b9e46dc78c0cf552d	2.5 kB
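
The MD5 digests listed for each file can be used to check that a download is intact. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `verify_downloads` are hypothetical, and it assumes the files were saved under their original names in a single directory.

```python
import hashlib
from pathlib import Path

# MD5 digests as listed in the file table of this record
EXPECTED_MD5 = {
    "adr_filtered.csv": "c80ef52b48bd9879aa173252cabed7cb",
    "mimic-5k_PCA_tSNE_clustering.ipynb": "d0aede0ba8c0a9b0639bc24306600a33",
    "README_ADR_Clustering_Task.txt": "b78c5cac4bc48e5b9e46dc78c0cf552d",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each expected file name to True (match), False (mismatch), or None (missing)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = (md5_of_file(path) == expected) if path.is_file() else None
    return results


if __name__ == "__main__":
    for name, status in verify_downloads(".").items():
        print(f"{name}: {status}")
```

A result of `False` suggests a corrupted or modified download, while `None` simply means the file was not found in the given directory.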

Additional details

Is part of: Dataset: https://physionet.org/content/mimiciii-demo/1.4/

Submitted: 2025-04-11

Programming language: Python

	All versions	This version
Views	663	110
Downloads	760	204
Data volume	7.7 GB	3.5 GB
