There is a newer version of the record available.

Published February 13, 2026 | Version 2.0.1
Dataset Open

ParaFin: Finnish Paradigms in Phonemic Notation

Authors/Creators

  • 1. Université Paris Cité
  • 2. Laboratoire de Linguistique Formelle
  • 3. Centre National de la Recherche Scientifique

Description

ParaFin is a collection of Finnish nominal paradigms in phonemic and orthographic notation, based on the Omorfi parser. The paradigms are suited to both computational and manual analysis. The dataset conforms to the Paralex standard.
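
As a rough illustration of the computational use mentioned above, the sketch below groups a long-format forms table into per-lexeme paradigms. It assumes the forms table uses the Paralex column names `form_id`, `lexeme`, `cell`, `phon_form` and `orth_form`. The sample rows, the cell labels (`nom.sg`, `gen.sg`) and the helper name `build_paradigms` are hypothetical and are not taken from the ParaFin release, so check the files in the archive for the actual file names and cell inventory.

```python
import csv
import io
from collections import defaultdict

# Toy rows for illustration only; they are NOT copied from the ParaFin release.
# Column names follow the Paralex forms-table convention (an assumption here),
# and the cell labels are hypothetical.
SAMPLE_FORMS_CSV = """form_id,lexeme,cell,phon_form,orth_form
1,talo,nom.sg,t ɑ l o,talo
2,talo,gen.sg,t ɑ l o n,talon
3,katu,nom.sg,k ɑ t u,katu
4,katu,gen.sg,k ɑ d u n,kadun
"""


def build_paradigms(csv_text, form_column="phon_form"):
    """Group long-format rows into {lexeme: {cell: [forms]}}.

    Each cell maps to a list so that a cell with several competing
    forms keeps all of them instead of overwriting earlier rows.
    """
    paradigms = defaultdict(lambda: defaultdict(list))
    for row in csv.DictReader(io.StringIO(csv_text)):
        paradigms[row["lexeme"]][row["cell"]].append(row[form_column])
    # Convert nested defaultdicts to plain dicts for predictable lookups.
    return {lexeme: dict(cells) for lexeme, cells in paradigms.items()}


paradigms = build_paradigms(SAMPLE_FORMS_CSV)
print(paradigms["katu"]["gen.sg"])
```

To work with a real file instead of the inline sample, the text of the forms table could be read from disk and passed to the same function. Passing `form_column="orth_form"` would build the orthographic paradigms instead.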

Files

2.0.1.zip (16.1 MB)
md5:96c24b2401b7dce13dfd1681a0188f47

Additional details

Related works

Is compiled by
Software: https://gitlab.com/finnic-morpho/parafin (URL)
Is derived from
Software: https://github.com/flammie/omorfi (URL)
Dataset: https://osf.io/7hrbv/ (URL)
Is identical to
Other: https://parafin.finug.eu (URL)

Software

Repository URL
https://gitlab.com/finnic-morpho/parafin
Programming language
Python
Development Status
Active

References

  • Tommi A. Pirinen, Inari Listenmaa, Ryan Johnson, Francis M. Tyers, and Juha Kuokkala. Open morphology of Finnish. University of Helsinki, 2017.
  • Tommi A. Pirinen. Development and Use of Computational Morphology of Finnish in the Open Source and Open Science Era: Notes on Experiences with Omorfi Development. SKY Journal of Linguistics, 28:381–393, 2015.
  • Sami Itkonen, Tuomo Häikiö, Seppo Vainio, and Minna Lehtonen. LASTU: A psycholinguistic search tool for Finnish lexical stimuli. Behavior Research Methods, 56(6):6165–6178, 2024. doi:10.3758/s13428-024-02347-x.
  • Juhani Luotolahti, Jenna Kanerva, Veronika Laippala, Sampo Pyysalo, and Filip Ginter. Towards Universal Web Parsebanks. In Joakim Nivre and Eva Hajičová, editors, Proceedings of the Third International Conference on Dependency Linguistics (Depling 2015), 211–220. Uppsala, Sweden, 2015. Uppsala University, Uppsala, Sweden.
  • David R. Mortensen, Siddharth Dalmia, and Patrick Littell. Epitran: Precision G2P for Many Languages. In Nicoletta Calzolari, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Koiti Hasida, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis, and Takenobu Tokunaga, editors, Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2711–2714. Miyazaki, 2018. European Language Resources Association.