Published October 4, 2021 | Version v1
Conference paper Open

Token-Level Multilingual Epidemic Dataset for Event Extraction

  • 1. University of La Rochelle, L3i, F-17000, La Rochelle, France
  • 2. Sorbonne University, Paris, France
  • 3. University of Innsbruck, Innsbruck, Austria
  • 4. Multimedia University, Nairobi, Kenya

Description

In this paper, we present a dataset and a baseline evaluation for multilingual epidemic event extraction. We experiment with a multilingual news dataset that we annotate at the token level, a tagging scheme commonly used in event extraction systems. We approach the task in two steps: we first detect the relevant documents in a large collection of news reports, and then perform event extraction (disease names and locations) on the documents detected as relevant. Preliminary experiments with the entire dataset and with ground-truth relevant documents showed promising results, while also establishing a stronger baseline for epidemiological event extraction.
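
To illustrate what token-level annotation can look like in practice, the following minimal sketch groups per-token tags into disease and location mentions. It assumes an IOB-style scheme with the hypothetical labels `DIS` and `LOC`. The record above does not specify the exact tag set, and the function name `decode_iob` and the example sentence are illustrative only, not taken from the dataset.

```python
from typing import List, Optional, Tuple


def decode_iob(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group token-level IOB tags into (label, text) entity mentions.

    Assumed tag format (hypothetical): "B-<LABEL>", "I-<LABEL>", or "O".
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    spans: List[Tuple[str, str]] = []
    current_label: Optional[str] = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Emit the span collected so far, if any.
        if current_label is not None and current_tokens:
            spans.append((current_label, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "B" or (prefix == "I" and label != current_label):
            # A "B-" tag, or an "I-" tag that does not continue the
            # current span, starts a new mention.
            flush()
            current_label, current_tokens = label, [token]
        elif prefix == "I":
            # Continuation of the current mention.
            current_tokens.append(token)
        else:
            # "O" (outside any mention) closes the current span.
            flush()
            current_label, current_tokens = None, []

    flush()
    return spans


# Illustrative sentence, not drawn from the dataset.
example_tokens = ["Cholera", "outbreak", "reported", "in", "South", "Sudan"]
example_tags = ["B-DIS", "O", "O", "O", "B-LOC", "I-LOC"]
mentions = decode_iob(example_tokens, example_tags)
print(mentions)
```

In a two-step setup like the one described above, a decoder of this kind would run only on documents that the first step has flagged as relevant.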

Files (164.1 kB)

Mutuvi2021_Chapter_Token-LevelMultilingualEpidemi.pdf

Additional details

Funding

European Commission
EMBEDDIA - Cross-Lingual Embeddings for Less-Represented Languages in European News Media 825153