Published July 25, 2022
| Version v1
Conference paper
Open
S2ORC-SemiCause: Annotating and analysing causality in the semiconductor domain
- 1. Know-Center GmbH
- 2. Know-Center GmbH, Graz University of Technology
- 3. University of Klagenfurt, Infineon AT
- 4. Graz University of Technology
Description
This work presents the S2ORC-SemiCause benchmark dataset. It is based on the S2ORC corpus, which was filtered for literature on semiconductor research and subsequently annotated by humans for causal relations. The resulting dataset differs from existing causality datasets of other domains in the long spans of its causes and effects, as well as in causal cue phrases exclusive to the domain of semiconductor research. As a consequence, this novel dataset poses challenges even for state-of-the-art token classification models such as S2ORC-SciBERT. It thus serves as a benchmark for causal relation extraction in the semiconductor domain.
Files
Name | Size |
---|---|
ie_causal_EAI4IA_workshop.pdf (md5:eb1895dff156f546ba8087d2b99cb2e2) | 332.6 kB |