CafeteriaSA corpus: Scientific abstracts annotated across different food semantic resources

doi:10.5281/zenodo.6683798

Published June 22, 2022 | Version 1.0

Dataset | Open access

1. Jožef Stefan Institute
2. Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University - Skopje
3. European Food Safety Authority

In recent decades, a great deal of work has been done on predictive modeling of issues related to human and environmental health. Progress on healthcare problems is made possible by several biomedical vocabularies and standards, which play a crucial role in understanding health information, together with a large amount of health-related data. However, despite the many resources available and the work done in the health and environmental domains, the food and nutrition domain lacks semantic resources, as well as interconnections between them. To address this gap, within CAFETERIA, a project funded by the European Food Safety Authority, we have developed the first annotated corpus of 500 scientific abstracts, containing 6,407 food entities annotated with respect to the Hansard taxonomy, 4,299 with respect to FoodOn, and 3,623 with respect to SNOMED-CT. The CafeteriaSA corpus will enable further development of natural language processing methods for extracting food information from scientific text.

Files (7.1 MB)

Name	MD5	Size
CafeteriaSA_Food.xml	6ce10cce1b4ba7b6686f015f6f2f4324	1.8 MB
CafeteriaSA_FOODON.xml	73052dfc53c7da08dd06fbe8a3f9298a	1.9 MB
CafeteriaSA_Hansard.xml	e478ff9b18c2b04848b85010003ea64d	1.8 MB
CafeteriaSA_SNOMEDCT.xml	da960f71082c4546c89cb5d75298f0bc	1.6 MB
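
The MD5 values listed with the files can be used to check that a download arrived intact. The following is a minimal sketch, assuming the four XML files have been saved under their listed names in a local directory. The names `EXPECTED_MD5`, `md5_of_file`, and `verify_downloads` are illustrative and not part of the dataset.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing of this record.
EXPECTED_MD5 = {
    "CafeteriaSA_Food.xml": "6ce10cce1b4ba7b6686f015f6f2f4324",
    "CafeteriaSA_FOODON.xml": "73052dfc53c7da08dd06fbe8a3f9298a",
    "CafeteriaSA_Hansard.xml": "e478ff9b18c2b04848b85010003ea64d",
    "CafeteriaSA_SNOMEDCT.xml": "da960f71082c4546c89cb5d75298f0bc",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory=".", expected=EXPECTED_MD5):
    """Map each expected file name to True (match), False (mismatch), or None (file missing)."""
    results = {}
    for name, checksum in expected.items():
        path = Path(directory) / name
        results[name] = (md5_of_file(path) == checksum) if path.is_file() else None
    return results
```

Calling `verify_downloads("path/to/downloads")` returns one entry per file. A `False` value suggests a corrupted or altered download, and `None` means the file was not found in that directory.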

