Biomedical Named Entity Recognition for Clinical Notes: A Comparative Study
Description
Biomedical Named Entity Recognition (NER) is an important task in clinical natural language processing: it extracts structured medical concepts from unstructured clinical text. This study presents a practical comparison of biomedical NER approaches applied to clinical notes. Three methods were evaluated: a SciSpacy-based pipeline and two transformer-based models, one built on BioBERT/PubMedBERT and one on ClinicalBERT. Experiments were conducted on approximately 5,000 clinical notes collected from publicly available medical transcription samples. The comparison focuses on practical implementation aspects, including handling of long clinical notes, tokenization behavior, engineering complexity, and runtime performance. Results show that the SciSpacy pipeline processes large collections of clinical notes significantly faster and more stably, while the transformer-based models offer more flexible contextual representations at the cost of increased computational overhead. These findings highlight the trade-offs between lightweight biomedical NLP pipelines and transformer-based models for large-scale clinical text processing.
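The description names handling of long clinical notes as one of the comparison points. BERT-style encoders typically accept a bounded input length (commonly 512 subword tokens), so long notes are often split into overlapping windows before inference. The following is a minimal sketch of that idea and not the study's actual implementation: the function name `chunk_tokens`, the default `max_len` and `stride` values, and the use of pre-split tokens as a stand-in for a model's subword tokenizer are all assumptions made for illustration.

```python
from typing import List, Tuple


def chunk_tokens(
    tokens: List[str], max_len: int = 512, stride: int = 128
) -> List[Tuple[int, List[str]]]:
    """Split a token list into overlapping windows of at most max_len tokens.

    Hypothetical helper: `stride` is the number of tokens shared between
    consecutive windows. Each result is (start_offset, window), so entity
    spans predicted inside a window can be mapped back to the full note.
    """
    if max_len <= 0 or not 0 <= stride < max_len:
        raise ValueError("need max_len > 0 and 0 <= stride < max_len")
    if not tokens:
        return []

    chunks = []
    step = max_len - stride  # how far the window advances each time
    start = 0
    while True:
        chunks.append((start, tokens[start:start + max_len]))
        if start + max_len >= len(tokens):
            break  # this window already reaches the end of the note
        start += step
    return chunks


# Usage: a 10-token "note" split into windows of 4 with 1 token of overlap.
note = "patient denies chest pain and reports mild shortness of breath".split()
for offset, window in chunk_tokens(note, max_len=4, stride=1):
    print(offset, window)
```

The overlap gives an entity that straddles a window boundary a chance to appear whole in at least one window. In a real pipeline the window size would be measured in the model's own subword tokens rather than whitespace tokens, and predictions from overlapping regions would need to be deduplicated.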
Files
| Name | Size |
|---|---|
| Biomedical_NER_Clinical_Notes.pdf (md5:5cb318d989883bc2cd3dc479091110b7) | 78.7 kB |
Additional details
Dates
- Accepted
- 2026-03-09 (Initial release)
Software
- Repository URL
- https://github.com/kazx22/spatial-human-anatomy
- Programming language
- Python
- Development Status
- Active