TRACES Hierarchical Classification of Categories of Linguistic and Psycholinguistic Markers of Deception with Bulgarian Expression Lists for Disinformation Detection
Description
These resources were created within Project TRACES (more information: https://traces.gate-ai.eu/). They contain a hierarchical classification of linguistic and psycholinguistic markers signaling deception, with 97 fine-grained and 18 coarse-grained categories. The markers were collected from related work (see the References section below), most of which addresses the English language. Most categories are accompanied by proposed detection methods that take into account the specifics of the Bulgarian language. As such, the classification can be adapted to other languages. The resources also contain lists of Bulgarian expressions, which are meant to be looked up in texts in order to detect some of the marker categories. One of the lists contains attention-attracting expressions collected from Bulgarian social media messages on Covid-19; some of these expressions are universal.
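The look-up described above could be sketched as follows. This is a minimal illustration in Python, not the project's own implementation: the function names `build_pattern` and `count_markers` are hypothetical, and the three sample expressions are placeholders chosen for the example rather than entries taken from the actual lists. The sketch matches whole expressions case-insensitively and counts how often each one occurs in a text.

```python
import re
from collections import Counter


def build_pattern(expressions):
    """Compile one regex that matches any of the given expressions as whole units."""
    # Drop empty entries and duplicates; try longer expressions first so that
    # a multi-word expression wins over a shorter one that is its prefix.
    ordered = sorted({e.strip() for e in expressions if e.strip()},
                     key=len, reverse=True)
    if not ordered:
        raise ValueError("the expression list is empty")
    alternation = "|".join(re.escape(e) for e in ordered)
    # (?<!\w) and (?!\w) act as boundaries that also work for Cyrillic text,
    # because Python str patterns are Unicode-aware by default.
    return re.compile(rf"(?<!\w)(?:{alternation})(?!\w)", re.IGNORECASE)


def count_markers(text, pattern):
    """Return a Counter mapping each matched expression (lowercased) to its frequency."""
    return Counter(m.group(0).lower() for m in pattern.finditer(text))


# Placeholder expressions (assumed for illustration, not taken from the resource).
expressions = ["внимание", "спешно", "моля, споделете"]
pattern = build_pattern(expressions)

text = "ВНИМАНИЕ! Спешно съобщение. Моля, споделете! Внимание към всички."
counts = count_markers(text, pattern)
print(counts)
```

Exact string matching of this kind only finds the surface forms that are present in a list. Inflected variants of an expression would presumably require lemmatization or additional list entries, depending on how the lists are built.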
These resources can be used to identify disinformation, understood as “false or misleading content that is spread with an intention to deceive or secure economic or political gain and which may cause public harm”. This definition is taken from the Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee, and the Committee of the Regions on the European Democracy Action Plan.
For more information, see our paper:
Irina Temnikova, Silvia Gargova, Ruslana Margova, Veneta Kireva, Ivo Dzhumerov, Tsvetelina Stefanova and Hristiana Nikolaeva (2023). New Bulgarian Resources for Detecting Disinformation. 10th Language and Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics (LTC'23). Poznań, Poland.