Conference paper Open Access

Word Clustering for Historical Newspapers Analysis

Lidia Pivovarova; Jani Marjanen; Elaine Zosa


Citation Style Language JSON Export

{
  "publisher": "Zenodo", 
  "DOI": "10.5281/zenodo.3402940", 
  "language": "eng", 
  "title": "Word Clustering for Historical Newspapers Analysis", 
  "issued": {
    "date-parts": [
      [
        2019, 
        9, 
        12
      ]
    ]
  }, 
  "abstract": "This paper is part of a collaboration between computer scientists and historians aimed at developing novel methods for the analysis of historical newspapers. We present a case study of ideological terms ending with the suffix -ism in nineteenth-century Finnish newspapers. We propose a two-step procedure to trace differences in word usage over time: training diachronic embeddings on several time slices, and then clustering the embeddings of selected words together with their neighbours to obtain historical context. The obtained clusters turn out to be useful for historical studies. The paper also discusses specific difficulties related to the development of historian-oriented tools.", 
  "author": [
    {
      "family": "Pivovarova", 
      "given": "Lidia"
    }, 
    {
      "family": "Marjanen", 
      "given": "Jani"
    }, 
    {
      "family": "Zosa", 
      "given": "Elaine"
    }
  ], 
  "id": "3402940", 
  "event-place": "Varna, Bulgaria", 
  "type": "paper-conference", 
  "event": "Language Technology for Digital Historical Archives (Workshop collocated with RANLP 2019) (LT-DHA 2019)"
}