Published October 4, 2019 | Version 1.0.0
Dataset Open

FinMeter models

  • 1. University of Helsinki

Description

This record contains the data files needed for FinMeter.

The data complements the FinMeter Python library described in:

Mika Hämäläinen and Khalid Alnajjar (2019). Let's FACE it. Finnish Poetry Generation with Aesthetics and Framing. In the Proceedings of The 12th International Conference on Natural Language Generation.

Sources:

The pretrained vectors for Finnish (fi) and English (en) are from E. Grave, P. Bojanowski, P. Gupta, A. Joulin, T. Mikolov, Learning Word Vectors for 157 Languages. Creative Commons Attribution-Share-Alike License 3.0. See https://fasttext.cc/docs/en/crawl-vectors.html

The word2vec model trained on the Finnish Internet ParseBank is from Kanerva, Jenna; Luotolahti, Juhani; Laippala, Veronika; Ginter, Filip: Syntactic N-gram Collection from a Large-Scale Corpus of Internet Finnish. Proceedings of the Sixth International Conference Baltic HLT. 2014. Creative Commons Attribution-ShareAlike 4.0 International License. See http://bionlp.utu.fi/finnish-internet-parsebank.html

The Finnish concreteness data has been automatically translated from Brysbaert, Marc, Amy Beth Warriner, and Victor Kuperman. "Concreteness ratings for 40 thousand generally known English word lemmas." Behavior research methods 46.3 (2014): 904-911. Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. See http://crr.ugent.be/archives/1330

Files

fi_concreteness.txt

Files (10.2 GB)

Checksum, Size
md5:d72ddc55d7f32e26dcb11e2f2b5c138d, 5.4 GB
md5:4c1d1570e1f7456f3a48d92868f0fa62, 1.5 GB
md5:836745563679b08550de13bb7713e227, 1.8 MB
md5:882670227a07af80d23852f9051b61cf, 2.7 GB
md5:549ef9dfec64d5e6febedcf7e19ba1f3, 663.7 MB
md5:40199a8b76838f5faaf295f1832dd747, 801.5 kB
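
Because several of the files are multiple gigabytes, it can be worth checking a download against the MD5 checksum listed above before using it. The following is a minimal sketch of one way to do that with the Python standard library. The helper names md5_of_file and matches_checksum are illustrative and are not part of FinMeter.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that multi-gigabyte files do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, matches_checksum(local_path, "md5:d72ddc55d7f32e26dcb11e2f2b5c138d") should return True for an intact copy of the 5.4 GB file, where local_path is wherever that file was saved. The file names are not shown in the listing above, so the path has to be supplied by the user.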

Additional details

Related works

Is supplemented by
Dataset: 10.5281/zenodo.3473449 (DOI)