Published March 9, 2026 | Version v1
Publication Open

Biomedical Named Entity Recognition for Clinical Notes: A Comparative Study

  • 1. Independent Researcher

Description

Biomedical Named Entity Recognition (NER) is an important task in clinical natural language processing, enabling the extraction of structured medical concepts from unstructured clinical text. This study presents a practical comparison of biomedical NER approaches applied to clinical notes. Three methods were evaluated: a SciSpacy-based pipeline and two transformer-based models, one built on BioBERT/PubMedBERT and one on ClinicalBERT. Experiments were conducted on approximately 5,000 clinical notes collected from publicly available medical transcription samples. The comparison focuses on practical implementation aspects: handling of long clinical notes, tokenization behavior, engineering complexity, and runtime performance. Results show that the SciSpacy pipeline processes large collections of clinical notes substantially faster and more stably, while the transformer-based models offer more flexible contextual representations at the cost of increased computational overhead. These findings highlight the trade-offs between lightweight biomedical NLP pipelines and transformer-based models for large-scale clinical text processing.
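
The handling of long clinical notes mentioned above matters because BERT-style models typically accept at most 512 tokens per input, so longer notes have to be split before inference. The description does not say how the study did this. The following is only a minimal sketch of one common approach, overlapping sliding windows. The function name `sliding_windows`, the `stride` of 384, and the whitespace tokenization are illustrative assumptions, not details taken from the study; a real pipeline would use the model's own subword tokenizer.

```python
from typing import Iterator, List, Tuple


def sliding_windows(
    tokens: List[str], max_len: int = 512, stride: int = 384
) -> Iterator[Tuple[int, List[str]]]:
    """Yield (start_offset, window) pairs that cover `tokens` with overlap.

    Hypothetical helper: max_len mirrors the usual BERT input limit and
    stride < max_len makes consecutive windows share tokens, so an entity
    cut at one window boundary appears whole in the neighbouring window.
    """
    if max_len <= 0 or not 0 < stride <= max_len:
        raise ValueError("need max_len > 0 and 0 < stride <= max_len")
    start = 0
    while True:
        yield start, tokens[start:start + max_len]
        # Stop once the current window reaches the end of the note.
        if start + max_len >= len(tokens):
            break
        start += stride


# Assumption: whitespace splitting stands in for a real subword tokenizer.
note = "patient denies chest pain " * 300
tokens = note.split()
windows = list(sliding_windows(tokens))
```

The start offset returned with each window lets predicted entity spans be mapped back to positions in the full note, where duplicates from the overlapping regions can then be merged. A pipeline that processes text sentence by sentence does not need this step, which is one source of the difference in engineering complexity between the two families of methods.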

Files

Biomedical_NER_Clinical_Notes.pdf (78.7 kB)
md5:5cb318d989883bc2cd3dc479091110b7

Additional details

Dates

Accepted
2026-03-09
Initial release

Software

Repository URL
https://github.com/kazx22/spatial-human-anatomy
Programming language
Python
Development Status
Active