Published July 25, 2022 | Version v1
Conference paper Open

S2ORC-SemiCause: Annotating and analysing causality in the semiconductor domain

  • 1. Know-Center GmbH
  • 2. Know-Center GmbH, Graz University of Technology
  • 3. University of Klagenfurt, Infineon AT
  • 4. Graz University of Technology

Description

This work presents the S2ORC-SemiCause benchmark dataset. It is based on the S2ORC corpus, which was filtered for literature on semiconductor research and subsequently annotated by humans for causal relations. The resulting dataset differs from existing causality datasets from other domains in its long cause and effect spans, as well as in causal cue phrases exclusive to the domain of semiconductor research. As a consequence, this novel dataset poses challenges even for state-of-the-art token classification models such as S2ORC-SciBERT. It thus serves as a benchmark for causal relation extraction in the semiconductor domain.
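To illustrate what framing causal relation extraction as token classification can look like, the following minimal sketch converts annotated cause and effect spans into per-token BIO tags, which is a common input format for token classification models. It is an assumption-laden illustration, not the dataset's actual annotation scheme or file format: the label names `Cause` and `Effect`, the helper `spans_to_bio`, and the example sentence are all hypothetical and were invented for this sketch.

```python
from typing import List, Tuple

# (start token index, end token index exclusive, label); a hypothetical format
Span = Tuple[int, int, str]


def spans_to_bio(num_tokens: int, spans: List[Span]) -> List[str]:
    """Turn token-level spans into BIO tags, one tag per token."""
    tags = ["O"] * num_tokens
    for start, end, label in spans:
        if not (0 <= start < end <= num_tokens):
            raise ValueError(f"span ({start}, {end}) is out of range")
        if any(tag != "O" for tag in tags[start:end]):
            raise ValueError(f"span ({start}, {end}) overlaps another span")
        tags[start] = f"B-{label}"          # first token of the span
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens of the span
    return tags


# Invented example sentence, not taken from the dataset.
tokens = "Increasing the annealing temperature reduces the defect density".split()
spans = [(0, 4, "Cause"), (5, 8, "Effect")]  # assumed label names
tags = spans_to_bio(len(tokens), spans)

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Under this framing, a long cause or effect span becomes a long run of `I-` tags that a model has to predict consistently, which is one plausible reason long spans are harder than short ones.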

Files

ie_causal_EAI4IA_workshop.pdf (332.6 kB)
md5:eb1895dff156f546ba8087d2b99cb2e2