Published March 1, 2019 | Version pre-print
Book chapter Open

Linguistic Bias in Crowdsourced Biographies: A Cross-lingual Examination

  • 1. Open University of Cyprus & RISE
  • 2. University of Nicosia

Description

Biographies make up a significant portion of Wikipedia entries and are a source of information and inspiration for the public. We examine a threat to their objectivity: linguistic biases, which are pervasive in human communication. Linguistic bias, the systematic asymmetry in the language used to describe people as a function of their social groups, plays a role in the perpetuation of stereotypes. Theory predicts that we describe people who fit our expectations (because they are members of our own in-groups or are stereotype-congruent) with more abstract, subjective language than we use for others. Abstract language has the power to sway our impressions of others because it implies stability over time. Extending our monolingual work, we consider biographies of intellectuals in the English- and Greek-language Wikipedias. We use our recently introduced sentiment analysis tool, DidaxTo, which extracts domain-specific opinion words, to build lexicons of subjective words in each language and for each gender, and we compare the extent to which abstract language is used. Contrary to expectation, we find evidence of gender-based linguistic bias, with women being described more abstractly than men. However, this finding is limited to English-language biographies. We discuss the implications of using DidaxTo to monitor linguistic bias in texts produced via crowdsourcing.

Files

LingBiasWiki.pdf

Files (453.8 kB)

Name: LingBiasWiki.pdf
Size: 453.8 kB
md5:7924295d13272a2f61fba3166efc6e2b

Additional details

Related works

Is new version of
10.1142/9789813274884_0012 (DOI)

Funding

CyCAT – Cyprus Center for Algorithmic Transparency (grant 810105)
European Commission
RISE – Research Center on Interactive Media, Smart Systems and Emerging Technologies (grant 739578)
European Commission