Published January 31, 2023 | Version 1
Dataset (Open Access)

Sentences with negative actors: negative strength quantified

  • University of Zurich

Description

Files: data1.xml, data2.xml, data3.xml (one file per annotator) - XML validated

- 439 sentences 
  - target: a negative cause (an actor etc.) represented by its lemma
  - id: sentence number
  - string: the plain sentence
  - strength: negativity strength of the target
    - labels 0-3
      - 0 no negative entity found (or parsing error)
      - 1 slightly negative, 2 negative, 3 strongly negative
  
- 115 out of 439 sentences carry tag 0, i.e. they do not contain a negative actor
  - for different reasons (see the paper below): modal verbs, future tense etc., but also parsing errors
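The split described above (label 0 versus labels 1-3) can be sketched with a small helper; `split_by_presence` is a hypothetical name, and the records are toy `(id, strength)` pairs, not the corpus itself.

```python
# Hypothetical helper splitting parsed records by the strength label:
# label 0 means no negative actor was found (or a parsing error occurred),
# labels 1-3 grade the negativity of the target.
def split_by_presence(records):
    """Return (no_actor, with_actor) lists from (id, strength) pairs."""
    no_actor = [r for r in records if r[1] == 0]
    with_actor = [r for r in records if r[1] > 0]
    return no_actor, with_actor

# Toy records; in the corpus, 115 of the 439 sentences carry label 0.
no_actor, with_actor = split_by_presence([(1, 0), (411, 3), (214, 2), (154, 1)])
```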


Data source: Facebook posts of the AfD, a German right-wing party

Examples:

no actor here, passive voice ("An 11-year-old boy was also injured."):
<sent><id>1</id><target>Junge</target><strength>0</strength><string>"Verletzt wurde auch ein 11-jähriger Junge . "</string></sent>
strongly negative ("The euro is ruining Europe."):
<sent><id>411</id><target>Euro</target><strength>3</strength><string>"Der Euro ruiniert Europa . "</string></sent>
negative ("Merkel is responsible for an additional 50 billion in social costs by 2018."):
<sent><id>214</id><target>Merkel</target><strength>2</strength><string>"Merkel verantwortet zusätzliche 50 Milliarden Sozialkosten bis 2018 . "</string></sent>
slightly negative ("Meuthen is harming the party."):
<sent><id>154</id><target>Meuthen</target><strength>1</strength><string>"Meuthen schadet der Partei . "</string></sent>
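Records like the examples above can be read with Python's standard library. This is a minimal sketch: the element names (`sent`, `id`, `target`, `strength`, `string`) follow the schema described earlier, but the surrounding root element (`<sents>` here) is an assumption, since the record does not show it.

```python
import xml.etree.ElementTree as ET

def parse_sentences(xml_text):
    """Yield (id, target, strength, sentence) tuples from an annotation file."""
    root = ET.fromstring(xml_text)
    for sent in root.iter("sent"):
        yield (
            int(sent.findtext("id")),
            sent.findtext("target"),
            int(sent.findtext("strength")),
            # The plain sentence is stored with surrounding quotes and spaces.
            sent.findtext("string").strip('" '),
        )

# One of the example records from above, wrapped in an assumed <sents> root:
doc = ('<sents><sent><id>411</id><target>Euro</target><strength>3</strength>'
       '<string>"Der Euro ruiniert Europa . "</string></sent></sents>')
records = list(parse_sentences(doc))
```

For the real files, replace `ET.fromstring` with `ET.parse("data1.xml").getroot()`.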

References:

@inproceedings{nodalida,
           month = {June},
          author = {Manfred Klenner and Anne G{\"o}hring and Sophia Conrad},
       booktitle = {Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa)},
         address = {Reykjavik, Iceland},
           title = {Getting Hold of Villains and other Rogues},
            note = {Virtual event},
           pages = {435--439},
            year = {2021},
        language = {english},
             url = {https://doi.org/10.5167/uzh-204265},
        abstract = {In this paper, we introduce the first corpus specifying negative entities within sentences. We discuss indicators for their presence, namely particular verbs, but also the linguistic conditions when their prediction should be suppressed. We further show that a fine-tuned Bert-based baseline model outperforms an over-generating rule-based approach which is not aware of these further restrictions. If a perfect filter were applied, both would be on par.}
}
 

Files

Files (87.6 kB)

md5:56df7d76594550b0ceebd2d83f6f0ea8