Published November 27, 2018 | Version 2019-02-01
Dataset Open

SPACCC

  • Barcelona Supercomputing Center

Description

[PlanTL/medicine/document]

The Spanish Clinical Case Corpus (SPACCC) is a manually classified collection of clinical case reports (specifically, the clinical case description sections) derived from open access Spanish medical publications (SciELO). The corpus contains a total of 1,000 clinical cases / 396,988 words. Notably, this kind of narrative shows properties of both the biomedical and medical literature and of clinical records (e.g. discharge summaries).

The SPACCC clinical cases are not restricted to a single medical discipline; they cover a variety of medical specialities, including oncology, urology, cardiology, pneumology and infectious diseases.

 

Copyright (c) 2018 Secretaría de Estado para el Avance Digital

 

Gonzalez-Agirre A, Marimon M, Intxaurrondo A, Rabal O, Villegas M, Krallinger M. PharmaCoNER: Pharmacological substances, compounds and proteins named entity recognition track. In Proceedings of The 5th Workshop on BioNLP Open Shared Tasks 2019 Nov (pp. 1-10).

Notes

Funded by the Plan de Impulso de las Tecnologías del Lenguaje (Plan TL).

Files

SPACCC.zip

Files (1.3 MB)

Name: SPACCC.zip
Size: 1.3 MB
md5:4223f56323b24c4cf9cf1d5792c9c56c
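
The MD5 checksum listed above can be used to confirm that a downloaded copy of the archive is intact. The following is a minimal sketch, assuming the file was saved locally as SPACCC.zip in the current directory; the helper names md5_of_file and matches_published_md5 are illustrative and not part of the dataset.

```python
import hashlib
from pathlib import Path

# Checksum published on this page for SPACCC.zip.
EXPECTED_MD5 = "4223f56323b24c4cf9cf1d5792c9c56c"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path, expected=EXPECTED_MD5):
    """Compare a file's MD5 digest with the expected hex string (case-insensitive)."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("SPACCC.zip")  # assumed local download path
    if archive.exists():
        status = "OK" if matches_published_md5(archive) else "MISMATCH"
        print(f"{archive}: checksum {status}")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually indicates an incomplete or corrupted download, in which case fetching the archive again is the simplest remedy.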

Additional details

References

  • Intxaurrondo A, Marimon M, Gonzalez-Agirre A, Lopez-Martin JA, Rodriguez H, Santamaria J, Villegas M, Krallinger M. Finding Mentions of Abbreviations and Their Definitions in Spanish Clinical Cases: The BARR2 Shared Task Evaluation Results. In IberEval@SEPLN 2018 (pp. 280-289).