Published September 10, 2021 | Version 1.1
Dataset Open

FONA corpus: Food & Nutrition Abstracts Multilingual corpus

  • 1. Barcelona Supercomputing Center

Description

The FONA corpus is a collection of case reports selected to support the development of language technologies, text mining, and NLP applications in the food and nutrition domain.

It contains a large collection of documents (titles and abstracts), each with metadata listing its MeSH terms. In addition, a subset of the collection is annotated with automatically recognized entities of the following categories:

  • medical procedures
  • symptoms
  • diseases
  • medications
  • occupational and demographic information
  • species (pathogens)
  • cancer morphology

Notes

Funded by the Plan de Impulso de las Tecnologías del Lenguaje (Plan TL).

Files

iberhelt.zip

Files (12.1 MB)

Name          Size
iberhelt.zip  12.1 MB
md5:d0d4c00d84c1536f652f2c158e86d0e7
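
The MD5 checksum listed above can be used to confirm that a downloaded copy of the archive is intact. The following is a minimal sketch in Python, assuming the file was saved as iberhelt.zip in the current working directory; the helper names md5_of_file and matches_expected are illustrative and not part of the dataset.

```python
import hashlib
from pathlib import Path

# Checksum as listed in the Files section of this record.
EXPECTED_MD5 = "d0d4c00d84c1536f652f2c158e86d0e7"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Compare the file's digest with the expected value, ignoring case."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumption: the archive was downloaded to the current directory.
    archive = Path("iberhelt.zip")
    if archive.exists():
        print("checksum OK" if matches_expected(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually points to an incomplete or corrupted download, in which case fetching the file again is the simplest remedy.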