Published December 14, 2021 | Version v1
Conference paper Open

Token-level Multilingual Epidemic Dataset for Event Extraction

  • 1. University of La Rochelle, Multimedia University, Nairobi, Kenya
  • 2. University of La Rochelle
  • 3. Sorbonne University
  • 4. University of Innsbruck
  • 5. Multimedia University, Nairobi, Kenya

Description

In this paper, we present a dataset and a baseline evaluation for multilingual epidemic event extraction. We experiment with a multilingual news dataset that we annotate at the token level, a tagging scheme commonly used in event extraction systems. We approach the task in two steps: first, we detect the relevant documents in a large collection of news reports; then, we perform event extraction (disease names and locations) on the documents detected as relevant. Preliminary experiments with the entire dataset and with ground-truth relevant documents showed promising results, while also establishing a stronger baseline for epidemiological event extraction.
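
The two-step approach described above (relevance detection, then token-level extraction of disease names and locations) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the IOB-style tag names `B-DIS`, `I-DIS`, `B-LOC`, `I-LOC`, the functions `extract_spans` and `extract_events`, and the keyword-based relevance check are all assumptions made for the example, and the dataset's actual label set and models may differ.

```python
from typing import Callable, List, Tuple

# A token paired with its tag, e.g. ("Cholera", "B-DIS").
# The tag names are hypothetical; the real dataset may use other labels.
Token = Tuple[str, str]


def extract_spans(tagged: List[Token]) -> List[Tuple[str, str]]:
    """Group IOB-tagged tokens into (label, text) spans."""
    spans: List[Tuple[str, str]] = []
    label, words = None, []
    for token, tag in tagged:
        prefix, _, name = tag.partition("-")
        # An I- tag with the same label continues the currently open span.
        if prefix == "I" and name == label:
            words.append(token)
            continue
        # Any other tag closes the open span, if there is one.
        if label is not None:
            spans.append((label, " ".join(words)))
        # B- (or a stray I-) opens a new span; "O" leaves none open.
        if prefix in ("B", "I") and name:
            label, words = name, [token]
        else:
            label, words = None, []
    if label is not None:
        spans.append((label, " ".join(words)))
    return spans


def extract_events(
    documents: List[List[Token]],
    is_relevant: Callable[[List[Token]], bool],
) -> List[List[Tuple[str, str]]]:
    """Step 1: keep relevant documents. Step 2: extract spans from each."""
    return [extract_spans(doc) for doc in documents if is_relevant(doc)]


# Toy relevance check (an assumption): a document is relevant if it
# mentions one of a few hand-picked keywords.
KEYWORDS = {"outbreak", "epidemic"}


def keyword_relevance(doc: List[Token]) -> bool:
    return any(token.lower() in KEYWORDS for token, _ in doc)


docs = [
    [("Cholera", "B-DIS"), ("outbreak", "O"), ("in", "O"),
     ("Nairobi", "B-LOC"), (",", "O"), ("Kenya", "B-LOC")],
    [("Football", "O"), ("results", "O"), ("from", "O"), ("Paris", "B-LOC")],
]
events = extract_events(docs, keyword_relevance)
```

In this sketch the second document is dropped in the first step because it contains none of the keywords, so only the first document reaches span extraction. In practice the relevance check would be a trained document classifier rather than a keyword lookup.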

Dataset

In addition to the paper, the accompanying datasets may also be of interest.

Files

TPDL_2021_Token_level_Multilingual_Epidemic_Dataset_for_Event_Extraction__Camera_Ready__Deadline__30th_June_2021_Pages__4.pdf

Additional details

Funding

NewsEye: A Digital Investigator for Historical Newspapers (grant 770299)
European Commission