Dataset Open Access

SPACCC_TOKEN

Montserrat Marimon; Martin Krallinger; Aitor Gonzalez Agirre; Marta Villegas; Ander Intxaurrondo

[PlanTL/medicine/annotated corpus/guidelines/tokenization] First version of the tokenization annotations for the Spanish Clinical Case Corpus. The annotations were produced with the Spanish Clinical Case Corpus Part-of-Speech Tagger, which is based on FreeLing 3.1 (SPACCC_POS-TAGGER, https://github.com/PlanTL/SPACCC_POS-TAGGER).

Funded by the Plan de Impulso de las Tecnologías del Lenguaje (Plan TL).

Files (13.0 MB)

Name: SPACCC_TOKEN.zip
Size: 13.0 MB
md5: 0c5696260771cdb2ff37963009e5be3c

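The published md5 value can be used to check that a downloaded copy of the archive is intact. The following is a minimal sketch, not part of the dataset: it assumes the archive was saved under its listed name in the current working directory, and the helper names `md5_of_file` and `matches_expected` are hypothetical.

```python
import hashlib
from pathlib import Path

# Values copied from the file listing on this record.
ARCHIVE_NAME = "SPACCC_TOKEN.zip"
EXPECTED_MD5 = "0c5696260771cdb2ff37963009e5be3c"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Compare a file's MD5 digest with the expected hex string (case-insensitive)."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    archive = Path(ARCHIVE_NAME)  # assumed download location
    if not archive.exists():
        print(f"{archive} not found; download it first")
    elif matches_expected(archive):
        print("checksum OK")
    else:
        print("checksum mismatch; consider downloading again")
```

A matching digest only indicates that the file was transferred without corruption; MD5 is not suitable as a defence against deliberate tampering.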
  • Villegas M, de la Peña S, Intxaurrondo A, Santamaria J, Krallinger M. Esfuerzos para fomentar la minería de textos en biomedicina más allá del inglés: el plan estratégico nacional español para las tecnologías del lenguaje. Procesamiento del Lenguaje Natural. 2017(59):141-4.

Views: 110
Downloads: 36

                   All versions   This version
Views              110            73
Downloads          36             21
Data volume        486.4 MB       273.3 MB
Unique views       90             66
Unique downloads   32             20
