There is a newer version of this record available.

Dataset Open Access

SemFi - Finnish Semantic Database with Syntactic Relations

Hämäläinen, Mika

SemFi is a semantic database for Finnish in which words are linked to each other by syntactic relations, together with the frequency of each relation in a large corpus.

SemFi is based on the syntactic bigrams of the Finnish Internet Parsebank, provided by the University of Turku.

The semfi.db file is an SQLite database, and it is the file that should be used. The results_json.zip file is mainly intended for those interested in working with SemUr, a translated version of SemFi.
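Because semfi.db is an ordinary SQLite file, its structure can be inspected with Python's built-in sqlite3 module before writing any queries. The following is a minimal sketch: the function name describe_sqlite and the assumption that semfi.db sits in the current directory are illustrative only, and no table or column names are assumed, since this record does not list the schema.

```python
import sqlite3
from pathlib import Path


def describe_sqlite(db_path):
    """Return {table_name: [column_name, ...]} for every table in the file."""
    # Open read-only so the downloaded database cannot be modified by accident.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        schema = {}
        for table in tables:
            # Quote the identifier; PRAGMA table_info returns one row per
            # column, and index 1 of each row is the column name.
            quoted = '"' + table.replace('"', '""') + '"'
            columns = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
            schema[table] = [column[1] for column in columns]
        return schema
    finally:
        conn.close()


if __name__ == "__main__":
    db_file = Path("semfi.db")  # assumed download location
    if db_file.exists():
        for table, columns in describe_sqlite(db_file).items():
            print(table, columns)
```

The table and column names printed by this sketch can then be used to write queries against the real data.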

The previous version of this dataset has been used successfully in the hard AI task of generating Finnish poetry automatically. That data still powers the computationally creative system Poem Machine.

More information and an online UI for browsing the data are available at https://mikakalevi.com/semfi/.

Files (6.1 GB)

Name              Size      MD5
results_json.zip  281.0 MB  40fbc866e41dfbf1714a3db8053c9e07
semfi.db          5.8 GB    c7c906ff1d76fe6eb81f537fb0b86708
semfyier.py       7.9 kB    519ef7a2675b24eb422e12519d5d46f4
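Given the size of semfi.db, it may be worth checking the downloaded files against the MD5 checksums listed above. The following is a minimal sketch using Python's hashlib; the helper names md5_of and verify are hypothetical, and the files are assumed to be in a single local directory.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file list of this record.
EXPECTED_MD5 = {
    "results_json.zip": "40fbc866e41dfbf1714a3db8053c9e07",
    "semfi.db": "c7c906ff1d76fe6eb81f537fb0b86708",
    "semfyier.py": "519ef7a2675b24eb422e12519d5d46f4",
}


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use low for the multi-gigabyte database.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory="."):
    """Map each file name to True (match), False (mismatch) or None (missing)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = (md5_of(path) == expected) if path.exists() else None
    return results


if __name__ == "__main__":
    for name, status in verify().items():
        print(name, status)
```

A value of False for any file suggests a corrupted or incomplete download.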
Statistics        All versions  This version
Views             1,182         65
Downloads         221           27
Data volume       416.2 GB      47.7 GB
Unique views      974           53
Unique downloads  145           17
