Published September 12, 2019 | Version v1
Conference paper Open

Clustering Ideological Terms in Historical Newspaper Data with Diachronic Word Embeddings

  • 1. University of Helsinki, Helsinki, Finland
  • 2. University of Tampere, Tampere, Finland

Description

During the nineteenth century, ideological language, mostly expressed through isms such as liberalism, socialism or conservatism, entered the lexicon of most European languages. Previous research, based on reading key texts, has claimed that the suffix ism was introduced to new linguistic domains in the period up to WWI, many of which do not relate to ideology. This paper takes a data-driven approach to the emergence of isms in nineteenth-century Finnish newspapers, using word embeddings to cluster them and to trace their thematic expansion over the period. The study thus provides a quantitatively sound way of tracking how isms relate to ideological language and, more generally, contributes to the understanding of the development of political language in Finland.
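As a rough illustration of what clustering isms by their embeddings can look like, the sketch below groups terms whose vectors have high cosine similarity. It is not the paper's method: the description does not say which clustering algorithm or embedding model was used, so the single-link grouping, the `threshold` value, the function names and the three-dimensional toy vectors (with English labels) are all assumptions made for the example. A diachronic analysis would repeat such a step on embeddings trained for each time slice.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cluster_terms(vectors, threshold=0.8):
    """Single-link grouping: two terms end up in the same cluster when a
    chain of pairwise similarities >= threshold connects them."""
    terms = sorted(vectors)
    parent = {t: t for t in terms}

    def find(t):
        # Follow parent links to the cluster representative.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for i, a in enumerate(terms):
        for b in terms[i + 1:]:
            if cosine(vectors[a], vectors[b]) >= threshold:
                parent[find(b)] = find(a)

    clusters = {}
    for t in terms:
        clusters.setdefault(find(t), []).append(t)
    return sorted(clusters.values())


# Hypothetical toy vectors for a single time slice (not data from the paper).
toy_vectors = {
    "liberalism": [0.9, 0.1, 0.0],
    "socialism": [0.8, 0.2, 0.1],
    "conservatism": [0.85, 0.15, 0.05],
    "rheumatism": [0.0, 0.1, 0.9],
    "alcoholism": [0.1, 0.2, 0.8],
}

clusters = cluster_terms(toy_vectors)
print(clusters)
```

With these toy vectors the ideological isms fall into one group and the non-ideological ones into another. Comparing such groupings across time slices is one way a thematic expansion of the suffix could be made visible.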

Files

paper_4.pdf (777.5 kB)
md5:d1a95e3a6ab1f96e119c18dfb70735c0

Additional details

Funding

European Commission: EMBEDDIA – Cross-Lingual Embeddings for Less-Represented Languages in European News Media (825153)