Published May 27, 2020 | Version 1.0
Dataset Open

CodiEsp Silver Standard: Participant predictions in eHealth CLEF2020 - Spanish clinical cases coded in ICD10 (CIE10)

  • 1. Barcelona Supercomputing Center

Description

Introduction

This dataset contains the predictions that participants in eHealth CLEF 2020 Task 1 (CodiEsp) submitted for the background set.

 

Zip structure

One directory per CodiEsp subtask. Within each CodiEsp subtask directory, there is one directory per team that contains the prediction runs.
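As a rough illustration of the layout described above, the following Python sketch walks an extracted copy of the archive and groups run files by subtask and team. The extraction path and the function name list_runs are assumptions made for this example. The actual directory and file names inside the zip are not listed here, so the code discovers them rather than hard-coding them.

```python
from pathlib import Path
from typing import Dict, List


def list_runs(root: Path) -> Dict[str, Dict[str, List[Path]]]:
    """Map each subtask directory to its team directories and their run files.

    Assumes the layout <root>/<subtask>/<team>/<run files>; names are
    discovered from disk, not taken from the dataset documentation.
    """
    runs: Dict[str, Dict[str, List[Path]]] = {}
    for subtask_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        teams: Dict[str, List[Path]] = {}
        for team_dir in sorted(p for p in subtask_dir.iterdir() if p.is_dir()):
            # Collect every regular file below the team directory.
            teams[team_dir.name] = sorted(
                f for f in team_dir.rglob("*") if f.is_file()
            )
        runs[subtask_dir.name] = teams
    return runs
```

For example, list_runs(Path("silver-standard")) would return a nested dictionary keyed first by subtask directory name and then by team directory name, assuming the archive was extracted to a directory called silver-standard.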

 

Format
The text documents are distributed as plain text files in UTF-8 encoding.
The CodiEsp Silver Standard annotations have the following format:

For the sub-tracks CodiEsp-Diagnostic and CodiEsp-Procedure, the files have the following fields:

articleID  ICD10-code 

Tab-separated files for the sub-track CodiEsp-X (explainability) contain extra fields that provide a label, the text-reference and its position:

articleID label ICD10-code text-reference reference-position
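The two layouts above can be read with a small parser such as the sketch below. It assumes that every file is tab-separated, that there is no header row, and that the layout can be recognised from the number of columns (two or five). The function name parse_predictions is hypothetical, and reference-position is kept as a raw string because its exact format is not specified here.

```python
import csv
from typing import Dict, Iterable, List

# Field names as given in the format description.
CODING_FIELDS = ["articleID", "ICD10-code"]
EXPLAIN_FIELDS = [
    "articleID", "label", "ICD10-code", "text-reference", "reference-position",
]


def parse_predictions(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Parse tab-separated prediction lines into dictionaries.

    Assumption: 2 columns means a Diagnostic/Procedure run and
    5 columns means an explainability (CodiEsp-X) run.
    """
    rows: List[Dict[str, str]] = []
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for cols in reader:
        cols = [c.strip() for c in cols]
        if not any(cols):
            continue  # skip blank lines
        if len(cols) == len(CODING_FIELDS):
            fields = CODING_FIELDS
        elif len(cols) == len(EXPLAIN_FIELDS):
            fields = EXPLAIN_FIELDS
        else:
            raise ValueError(f"unexpected number of columns: {len(cols)}")
        rows.append(dict(zip(fields, cols)))
    return rows
```

A run file could then be loaded with open(path, encoding="utf-8", newline="") and passed directly to parse_predictions, since the files are described as UTF-8 plain text.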

 

Resources:

  • Web
  • Citation: Miranda-Escalada, A., Farré, E., & Krallinger, M. (2020). Named entity recognition, concept normalization and clinical coding: Overview of the cantemist track for cancer text mining in spanish, corpus, guidelines, methods and results. In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2020), CEUR Workshop Proceedings.
  • Gold Standard corpus
  • Annotation guidelines
  • YouTube presentations
  • Participant codes

 

All credit to CodiEsp participants

Notes

Funded by the Plan de Impulso de las Tecnologías del Lenguaje (Plan TL).

Files

silver-standard.zip (275.4 MB)
md5:22fd892e267ba9361b08d36830c7c61f
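A downloaded copy of silver-standard.zip can be checked against the MD5 checksum listed above. The sketch below is one way to do that with the Python standard library. The function name md5_of_file and the local file path are assumptions for this example.

```python
import hashlib
from pathlib import Path

# Checksum as listed for silver-standard.zip on this page.
EXPECTED_MD5 = "22fd892e267ba9361b08d36830c7c61f"


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        # Read fixed-size chunks so a large archive is never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If the archive was saved to the current directory, md5_of_file(Path("silver-standard.zip")) == EXPECTED_MD5 should evaluate to True for an intact download.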