Published June 7, 2021 | Version 1.0
Dataset Open

Old Literary Finnish aligned word embeddings

  • 1. RootRoo
  • 2. University of Helsinki

Description

This repository contains aligned word embeddings for four centuries of Old Literary Finnish. The original data source is the Corpus of Old Literary Finnish:

> Institute for the Languages of Finland (2013). Corpus of Old Literary Finnish [text corpus]. The Language Bank of Finland. Retrieved from http://urn.fi/urn:nbn:fi:lb-201407165

These are simple word2vec embeddings intended as a baseline for research on semantic change. They will be updated when more material becomes available. To date, the authors have not been able to study semantic change with these embeddings, but determining the threshold for such analysis is itself an important target of investigation.
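Because the embeddings are aligned, a word's vectors from two periods can be compared directly. The following is a minimal sketch of one common baseline measure, the cosine distance between a word's vectors in two aligned spaces. It is not the authors' method. The function names (`cosine_similarity`, `semantic_shift`) and the three-dimensional toy vectors are hypothetical and stand in for vectors loaded from the files in this repository, whose names and format are not assumed here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def semantic_shift(word, earlier, later):
    """Cosine distance of a word between two aligned spaces.

    Returns None when the word is missing from either space.
    """
    if word not in earlier or word not in later:
        return None
    return 1.0 - cosine_similarity(earlier[word], later[word])


# Toy data only (hypothetical): two "centuries" as word -> vector mappings.
space_a = {"kirja": [1.0, 0.0, 0.0], "pappi": [0.0, 1.0, 0.0]}
space_b = {"kirja": [1.0, 0.0, 0.0], "pappi": [0.0, 0.0, 1.0]}

for word in sorted(space_a):
    print(word, semantic_shift(word, space_a, space_b))
```

A distance near 0 means the word occupies a similar position in both periods, and larger values suggest a possible shift. With small corpora, such values may reflect noise rather than change, which is the threshold question raised above.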
 

Files

README.md

Files (46.8 MB)

Checksum Size
md5:40e7fe552521ae6bd55cea7ae4f43a2f 752 Bytes
md5:38c23a8d6c8adf63bdfc99b8c56c056a 9.3 MB
md5:ffdcbc60849b8dbaca10058e47ec8d04 15.3 MB
md5:9f3dc5e049981d6019b06605c8d8bbb8 14.7 MB
md5:9a8e6ac754a51a178895e8a7e7188f97 7.4 MB
