Published January 31, 2023 | Version 1
Dataset Open

writer stance dataset

Authors/Creators

  • 1. University of Zurich,

Description

Repository: writer stance

- 1000 German texts from the X-stance corpus (see References X-stance) annotated for writer stance
- two formats: conll and a BIO-like token format (labels PRO/CON/O, no B-/I- prefixes); see below
- goal: explicit and implicit writer stance with respect to entities (nouns) and (some) events (verbs)
  - implicit and explicit stance are not distinguished (the annotation does not indicate which case applies)

- labels:       BIO    conll
  - in favour:  PRO    p
  - against:    CON    c
  - neutral:    O      no label
- 3 annotators (see References, DeInStance)

Formats:

conll (output of the ParZu parser):

1   Das die ART ART Def|Neut|Nom|Sg 2   det _   _
p2  Arbeitsgesetz   Arbeitsgesetz   N   NN  Neut|Nom|Sg 3   subj    _   _
3   regelt  regeln  V   VVFIN   3|Sg|Pres|Ind   0   root    _   _
4   die die ART ART Def|Fem|Acc|Pl  5   det _   _
5   Arbeitszeiten   Arbeitszeit N   NN  Fem|Acc|Pl  3   obja    _   _
6   und und KON KON _   3   kon _   _
7   schützt schützen    V   VVFIN   _|_|Pres|Ind    6   cj  _   _
8   den die ART ART Def|Masc|Acc|Sg 9   det _   _
p9  Arbeitnehmer    Arbeitnehmer    N   NN  Masc|Acc|Sg 7   obja    _   _
10  .   .   $.  $.  _   0   root    _   _

The stance label is attached to the head word, as a prefix directly in front of the token index (first column),
e.g. "p2  Arbeitsgesetz   Arbeitsgesetz   N   NN  Neut|Nom|Sg 3   subj    _   _"
i.e. the writer is in favour of "Arbeitsgesetz".
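
A minimal Python sketch of how the prefixed index could be read. It assumes that the first column is an optional "p" or "c" directly followed by the token index, and that columns are separated by whitespace, as in the sample above. The function name and the mapping to the BIO labels are illustrative and not part of the dataset.

```python
import re

# Assumption: first column = optional stance prefix ("p" or "c") + token index.
INDEX_RE = re.compile(r"^([pc]?)(\d+)$")
# Mapping taken from the label list above; "" means no prefix (neutral).
PREFIX_TO_LABEL = {"p": "PRO", "c": "CON", "": "O"}


def parse_conll_line(line):
    """Return (index, wordform, lemma, label) for one token line."""
    fields = line.split()  # assumption: whitespace-separated columns
    match = INDEX_RE.match(fields[0])
    if match is None:
        raise ValueError(f"unexpected first column: {fields[0]!r}")
    prefix, index = match.groups()
    return int(index), fields[1], fields[2], PREFIX_TO_LABEL[prefix]
```

Applied to the example line, this gives index 2, wordform and lemma "Arbeitsgesetz", and label "PRO".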

BIO: wordform lemma label

Das die O
Arbeitsgesetz Arbeitsgesetz PRO
regelt regeln O
die die O
Arbeitszeiten Arbeitszeit O
und und O
schützt schützen O
den die O
Arbeitnehmer Arbeitnehmer PRO
. . O

The stance label is attached to the head word, in the last column,
e.g. "Arbeitsgesetz Arbeitsgesetz PRO"
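
A short Python sketch of reading this format and collecting the stance targets. It assumes three whitespace-separated fields per token line and skips empty lines; how texts are separated in the actual files is not stated here, and the function names are illustrative.

```python
def read_bio(lines):
    """Yield (wordform, lemma, label) triples from BIO-style lines."""
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # assumption: empty lines carry no token
        if len(parts) != 3:
            raise ValueError(f"expected 3 fields, got: {line!r}")
        yield tuple(parts)


def stance_targets(lines):
    """Return (lemma, label) pairs for all tokens labeled PRO or CON."""
    return [(lemma, label) for _, lemma, label in read_bio(lines) if label != "O"]
```

For the sample sentence above, `stance_targets` returns the lemmas "Arbeitsgesetz" and "Arbeitnehmer", both labeled PRO.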


for X-stance corpus see also https://vamvas.ch/more-general-stance-detection-with-x-stance


References:

@article{X-stance,
  author    = {Jannis Vamvas and
               Rico Sennrich},
  title     = {X-Stance: {A} Multilingual Multi-Target Dataset for Stance Detection},
  journal   = {CoRR},
  volume    = {abs/2003.08385},
  year      = {2020},
  url       = {https://arxiv.org/abs/2003.08385},
  eprinttype = {arXiv},
  eprint    = {2003.08385},
  timestamp = {Tue, 24 Mar 2020 16:42:29 +0100},
  biburl    = {https://dblp.org/rec/journals/corr/abs-2003-08385.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

@inproceedings{DeInStance,
       booktitle = {17th Conference on Natural Language Processing (KONVENS)},
           month = {September},
           title = {DeInStance: Creating and Evaluating a {G}erman Corpus for Fine-Grained Inferred Stance Detection},
          author = {Anne G{\"o}hring and Manfred Klenner and Sophia Conrad},
       publisher = {ACL Anthology},
            year = {2021},
           pages = {213--217},
        language = {english},
             url = {https://doi.org/10.5167/uzh-207940},
        abstract = {We introduce deInStance, a corpus of 1000 politicians' answers in German (de) containing sentences labeled with explicitly expressed and inferred stances - pro and con relations - by 3 annotators. They achieved an acceptable inter-rater agreement given the inherent subjective nature of the task. A first baseline, a fine-tuned BERT-based token classifier, achieved F1-scores of around 70\%. Our focus is on the difficult subclass of sentences comprising only non-polar words, but still with an (implicit) pro or con perspective of the writer.}
}

Files

Files (1.6 MB)

md5:64ca764d7af080bfe42f850546499ee6 (1.6 MB)