{ "access": { "embargo": { "active": false, "reason": null }, "files": "public", "record": "public", "status": "open" }, "created": "2020-03-11T16:54:15.194541+00:00", "custom_fields": {}, "deletion_status": { "is_deleted": false, "status": "P" }, "files": { "count": 1, "enabled": true, "entries": { "codiesp_codes.zip": { "checksum": "md5:531d29c58447e86820517a5a0c437a2d", "ext": "zip", "id": "e9c27dbb-5947-4a72-9e16-f633e3295df7", "key": "codiesp_codes.zip", "metadata": null, "mimetype": "application/zip", "size": 2002661 } }, "order": [], "total_bytes": 2002661 }, "id": "3706838", "is_draft": false, "is_published": true, "links": { "access": "https://zenodo.org/api/records/3706838/access", "access_links": "https://zenodo.org/api/records/3706838/access/links", "access_request": "https://zenodo.org/api/records/3706838/access/request", "access_users": "https://zenodo.org/api/records/3706838/access/users", "archive": "https://zenodo.org/api/records/3706838/files-archive", "archive_media": "https://zenodo.org/api/records/3706838/media-files-archive", "communities": "https://zenodo.org/api/records/3706838/communities", "communities-suggestions": "https://zenodo.org/api/records/3706838/communities-suggestions", "doi": "https://doi.org/10.5281/zenodo.3706838", "draft": "https://zenodo.org/api/records/3706838/draft", "files": "https://zenodo.org/api/records/3706838/files", "latest": "https://zenodo.org/api/records/3706838/versions/latest", "latest_html": "https://zenodo.org/records/3706838/latest", "media_files": "https://zenodo.org/api/records/3706838/media-files", "parent": "https://zenodo.org/api/records/3632522", "parent_doi": "https://zenodo.org/doi/10.5281/zenodo.3632522", "parent_html": "https://zenodo.org/records/3632522", "requests": "https://zenodo.org/api/records/3706838/requests", "reserve_doi": "https://zenodo.org/api/records/3706838/draft/pids/doi", "self": "https://zenodo.org/api/records/3706838", "self_doi": "https://zenodo.org/doi/10.5281/zenodo.3706838", 
"self_html": "https://zenodo.org/records/3706838", "self_iiif_manifest": "https://zenodo.org/api/iiif/record:3706838/manifest", "self_iiif_sequence": "https://zenodo.org/api/iiif/record:3706838/sequence/default", "versions": "https://zenodo.org/api/records/3706838/versions" }, "media_files": { "count": 0, "enabled": false, "entries": {}, "order": [], "total_bytes": 0 }, "metadata": { "additional_descriptions": [ { "description": "Funded by the Plan de Impulso de las Tecnolog\u00edas del Lenguaje (Plan TL).", "type": { "id": "notes", "title": { "de": "Anmerkungen", "en": "Notes" } } } ], "creators": [ { "affiliations": [ { "name": "Barcelona Supercomputing Center" } ], "person_or_org": { "family_name": "Miranda-Escalada", "given_name": "Antonio", "identifiers": [ { "identifier": "0000-0002-5654-001X", "scheme": "orcid" } ], "name": "Miranda-Escalada, Antonio", "type": "personal" } }, { "affiliations": [ { "name": "Barcelona Supercomputing Center" } ], "person_or_org": { "family_name": "Krallinger", "given_name": "Martin", "identifiers": [ { "identifier": "0000-0002-2646-8782", "scheme": "orcid" } ], "name": "Krallinger, Martin", "type": "personal" } } ], "description": "
Please cite if you use this dataset:
\n\nAntonio Miranda-Escalada, Aitor Gonzalez-Agirre, Jordi Armengol-Estapé and Martin Krallinger. Overview of automatic clinical coding: annotations, guidelines, and solutions for non-English clinical cases at CodiEsp track of CLEF eHealth 2020. In CLEF (Working Notes). 2020
\n\n@inproceedings{miranda2020overview,\n title={Overview of automatic clinical coding: annotations, guidelines, and solutions for non-english clinical cases at codiesp track of CLEF eHealth 2020},\n author={Miranda-Escalada, Antonio and Gonzalez-Agirre, Aitor and Armengol-Estap{\\'e}, Jordi and Krallinger, Martin},\n booktitle={Working Notes of Conference and Labs of the Evaluation (CLEF) Forum. CEUR Workshop Proceedings},\n year={2020}\n}
\n\n\n\n
This compressed folder contains two files:
\n\n + codiesp-D_codes.tsv: list of CIE10-Diagnósticos terms (2018 version) with their description in Spanish and in English.
\n + codiesp-P_codes.tsv: list of CIE10-Procedimiento terms (2018 version) with their description in Spanish and in English. The list also contains codes up to the 4th axis, which are used in the CodiEsp-P track for annotation reasons.
A limited number of codes do not have an English description because they were removed from the English version but maintained in the Spanish version of the terminology.
\n\n\n\n
Format:
\nTab-separated files with 3 columns
\ncode es-description en-description
\n\n
Spanish to English description mapping:
\nFor CodiEsp-D, the English descriptions were mapped using the files on the National Center for Health Statistics webpage: https://www.cdc.gov/nchs/icd/icd10cm.htm
\nSpecifically, the file used was: ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/Publications/ICD10CM/2018/2018-ICD-10-CM-Codes-File.zip/icd10cm_codes_2018.txt
For CodiEsp-P, the English descriptions were mapped using the files on the Centers for Medicare & Medicaid Services webpage: https://www.cms.gov/Medicare/Coding/ICD10/2018-ICD-10-PCS-and-GEMs
\nSpecifically, the file used was: 2018_icd10pcs_codes_file.zip/icd10pcs_codes_2018.txt
\n\nA considerable amount of medically relevant information is hidden in large, unstructured, heterogeneous data collections, such as the medical literature, medicinal patents, electronic health records or specialized web content (health blogs, patient forums or information generated by scientific and medical societies). To process medical big data more efficiently, there is growing interest in natural language processing and text mining approaches, in particular deep learning and artificial intelligence-based strategies.
\n\nThe aim of the Plan de Impulso de las Tecnologías del Lenguaje (Plan TL), the Spanish national Plan for the Advancement of Language Technology, is to promote the development of resources of critical importance for processing textual data in Spanish as well as Catalan, Basque and Galician. The health and biomedical domain constitutes one of the flagship topics of the Spanish Plan TL.
\n\nTo promote the development of health-related language technology applications, the Plan TL is both developing and identifying key resources, including individual components/libraries, terminological resources, annotated corpora and annotation guidelines, as well as document collections and language models.
", "title": "Medical NLP (maintained by NLP4BIA unit at BSC)\u2013 language technology resources for clinical and biomedical documents in multiple languages", "type": { "id": "topic" }, "website": "https://www.bsc.es/discover-bsc/organisation/research-departments/nlp-biomedical-information-analysis" }, "revision_id": 1, "slug": "medicalnlp", "updated": "2023-11-07T07:23:33.103958+00:00" } ], "ids": [ "629a6594-ebed-4f9d-aa2a-16766d76d068" ] }, "id": "3632522", "pids": { "doi": { "client": "datacite", "identifier": "10.5281/zenodo.3632522", "provider": "datacite" } } }, "pids": { "doi": { "client": "datacite", "identifier": "10.5281/zenodo.3706838", "provider": "datacite" }, "oai": { "identifier": "oai:zenodo.org:3706838", "provider": "oai" } }, "revision_id": 5, "stats": { "all_versions": { "data_volume": 525720588.0, "downloads": 278, "unique_downloads": 259, "unique_views": 1375, "views": 1557 }, "this_version": { "data_volume": 192255456.0, "downloads": 96, "unique_downloads": 91, "unique_views": 658, "views": 724 } }, "status": "published", "updated": "2021-05-07T10:02:22.939148+00:00", "versions": { "index": 2, "is_latest": true } }