Published October 22, 2024 | Version 0.1
Dataset Open

A Digitized Historical Database of Cultural, Economic, Demographic, and Ecological Data of Finland (CEDEDA)

Description

This package includes historical data from Finland describing spatial variation in culture, environment, demography and ecology from the 1600s to the 1900s. We have divided the data into categories of culture (e.g. boat types or wedding rituals), environment (e.g. rainfall), demography (e.g. population numbers) and ecology (e.g. soil type). Note that some of the data may fit several categories and be, for example, cultural-environmental (e.g. amount of farmed field).

The package includes both binary and continuous data. We provide (1) data files (Binary_Data.csv, Continous_Data.csv) that consist of thematic attributes and the names, codes and coordinates of municipalities, (2) explanation files (Binary_Data_Explanations.csv and Continous_Data_Explanations.csv) that state where each variable was collected from, the approximate timing of the variable and, for some variables, their typological classification, and (3) a polygon file (Municipality_polygons_UTF8) that describes the boundaries of municipalities.
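As a rough illustration of how a data file and its explanation file could be read together, the sketch below uses only the Python standard library. The column names (code, name, lat, lon, boat_type_a, variable, source, period), the comma delimiter and all values are assumptions made up for this example; the actual headers in the CSV files may differ and should be checked first.

```python
import csv
import io


def read_table(source, delimiter=","):
    """Read a delimited text stream into a list of dicts keyed by column name."""
    return list(csv.DictReader(source, delimiter=delimiter))


def describe_variable(explanations, variable, name_column="variable"):
    """Return the explanation row for one variable, or None if it is absent."""
    for row in explanations:
        if row.get(name_column) == variable:
            return row
    return None


# Tiny in-memory stand-ins for Binary_Data.csv and Binary_Data_Explanations.csv.
# Every column name and value below is invented for illustration only.
data_csv = io.StringIO(
    "code,name,lat,lon,boat_type_a\n"
    "001,Exampleville,60.45,22.27,1\n"
)
explanations_csv = io.StringIO(
    "variable,source,period\n"
    "boat_type_a,Hypothetical atlas,1900s\n"
)

data = read_table(data_csv)
explanations = read_table(explanations_csv)
info = describe_variable(explanations, "boat_type_a")
print(data[0]["name"], info["source"], info["period"])
```

With the real files, the in-memory streams would be replaced by something like open("Binary_Data.csv", newline="", encoding="utf-8"); the encoding is likewise an assumption.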

The original data comes from historical statistical yearbooks, historical atlases and geographical databases, most of which were compiled to present the history of the Finnish nation. Some of the unchanging physical information (e.g. soil types) is based on modern geographical databases. See the references for data sources below and in the explanation files.

The data has been digitized at the University of Turku by the BEDLAN research team (Biological Evolution and the Diversification of Languages). The datasets were digitized by Ilpo Tammi and Sofia Koskela under the supervision of Terhi Honkola and Timo Rantanen, and later described and prepared for publication by Paavo Jordman under the supervision of Jenni Santaharju in the Human diversity consortium.

The coordinates of municipalities were calculated by Honkola et al. (2018) based on the locations of the largest or main settlement in each municipality at the turn of the 19th and 20th centuries (settlement data, Harju 1920). The polygon data represent Finnish municipality (or parish) boundaries in the 1920s. The digitization was done in Syrjänen et al. (2016), mainly based on the facsimile of Suomen karttakirja 1920 (Harju 2009) and the Atlas of Finland 1925 (Geographical Society of Finland, 1928). Since we provide data per municipality, and also give codes, coordinates and polygon data for municipalities, the data is interoperable with the digitized Dialect atlas of Finnish (Santaharju et al. 2025) and data from previous research (Syrjänen et al. 2016; Honkola et al. 2018; Lynch et al. 2022; Santaharju et al. in revision; Rantanen, Santaharju et al. manuscript) as well as other spatial data over Finland. The continuous cultural and environmental data has been used by Honkola et al. (2018), the continuous environmental data by Lynch et al. (2022) and the binary cultural data by Rantanen, Santaharju et al. (manuscript).
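Interoperability through shared municipality codes can be sketched as a simple join between two tables. This is a minimal example and not the procedure used in the cited studies; the key name "code", the variable names and all values are hypothetical.

```python
def join_on_code(left, right, key="code"):
    """Inner-join two lists of dicts on a shared municipality code."""
    right_by_code = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = right_by_code.get(row[key])
        if match is not None:
            merged = dict(match)
            merged.update(row)  # values from `left` win if column names clash
            joined.append(merged)
    return joined


# Invented rows standing in for a CEDEDA table and another
# municipality-level dataset that uses the same codes.
cededa_rows = [
    {"code": "001", "rainfall_mm": "600"},
    {"code": "002", "rainfall_mm": "550"},
]
other_rows = [
    {"code": "001", "feature_x": "1"},
    {"code": "003", "feature_x": "0"},
]

combined = join_on_code(cededa_rows, other_rows)
print(combined)
```

Only municipalities present in both tables survive an inner join, so comparing row counts before and after is a quick way to spot codes that do not match.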

This CEDEDA database is part of The Uralic Trove, a digital data infrastructure of speaker areas of Uralic Languages and Finnish Dialects (Vesakoski et al. 2025). We also aim to add the data to the interactive user interface URHIA (Uralic Historical Atlas, Roose et al. 2025). A thorough data description will be given in Jordman et al. (in preparation).

 

Contact

Postdoctoral researcher Jenni Santaharju, University of Turku, jenni.santaharju@utu.fi

Associate Professor Outi Vesakoski, University of Turku, outi.vesakoski@utu.fi

 

How to cite?

Citation for the data: Jordman, Paavo; Terhi Honkola; Timo Rantanen; Ilpo Tammi; Outi Vesakoski and Jenni Santaharju. "Data release: Digitized Historical Database of Cultural, Economic, Demographic, and Ecological Data of Finland (CEDEDA)". Manuscript in preparation.

Citation for the coordinates and polygons of municipalities: Santaharju, Jenni; Kaj Syrjänen; Terhi Honkola; Perttu Seppä; Outi Vesakoski and Unni-Päivä Leino. 2025. “Data release: Digitized Dialect Atlas of Finnish by Lauri Kettunen.” Digital Humanities in the Nordic and Baltic Countries Publication. DOI: 10.5617/dhnbpub.12270

 

Acknowledgements

The work has been funded by the Kone Foundation, the Research Council of Finland (grant no. 352727) and the Turku University Foundation.

 

References for data sources

Atlas of Finland 1925. 1928. The Geographical Society of Finland. Helsinki.

Atlas of Finland. 131: Climate. 1988. The Geographical Society of Finland. Helsinki.

Atlas of Finland. 141: Biogeography. 1988. The Geographical Society of Finland. Helsinki.

Atlas of the Finnish History. 2007. Karttakeskus. Helsinki.

Harju, Erkki-Sakari. 2009. Suomen karttakirja 1920 = Kartboken över Finland 1920. Karttakeskus. Helsinki. 2. facsimile ed.

Talve, I. 1976. Suomen kulttuurirajoista ja -alueista. Finnish Academy of Science and Letters.

Important Landscape Areas; Report II of the working group on landscape areas. Ministry of the Environment. 1992. http://hdl.handle.net/10138/29087.

Statistical yearbook of Finland. 1882. Central Statistical Office of Finland.

Yleinen katsaus väkiluvunmuutoksiin Suomessa vuosina 1882 ja 1883. 1885. Central Statistical Office of Finland.

Katsaus Suomenmaan taloudelliseen tilaan. Viisivuotiskausi 1881–1885. 1890. Central Statistical Office of Finland.

Kunnallinen verotus vuoden 1924 tuloista. 1928. Central Statistical Office of Finland.

Kunnallinen verotus vuoden 1927 tuloista. 1931. Central Statistical Office of Finland.

Kunnallinen verotus vuoden 1932 tuloista. 1935. Central Statistical Office of Finland.

Maataloustiedustelu Suomessa vuonna 1910. Maanviljelys. 1918. Direction of Agriculture.

Maataloustiedustelu Suomessa vuonna 1910. Karjanhoito. 1916. Direction of Agriculture.

Superficial deposits 1:1 000 000. Geological Survey of Finland 1972–2007.

Topographic Database. National Land Survey of Finland.

 

References for studies that used the data

Honkola, Terhi; Kalle Ruokolainen; Kaj Syrjänen; Unni-Päivä Leino; Ilpo Tammi; Niklas Wahlberg and Outi Vesakoski. 2018. “Evolution within a language: Environmental differences contribute to divergence of dialect groups.” BMC Evolutionary Biology. 18:132. DOI: 10.1186/s12862-018-1238-6

Lynch, Robert; John Loehr; Virpi Lummaa; Terhi Honkola; Jenni Pettay and Outi Vesakoski. 2022. “Socio-cultural similarity with host population rather than ecological similarity predicts success and failure of human migrations.” Proceedings of the Royal Society B, 289: 20212298. DOI: 10.1098/rspb.2021.2298

Rantanen, Timo; Jenni Santaharju; Michael Dunn; Harri Tolvanen; Elina Salmela; Unni Leino; Päivi Onkamo and Outi Vesakoski. "East-west division of genes, language and culture in Finland – drivers and hindrances." *Shared first authorship. Manuscript in preparation.

 

Other references

Kettunen, Lauri. 1940. “Suomen Murteet III A. Murrekartasto.” Suomalaisen Kirjallisuuden Seura. Helsinki.

Santaharju, Jenni; Kaj Syrjänen; Terhi Honkola; Perttu Seppä; Unni Leino and Outi Vesakoski. "Linguistic convergence and its drivers in Finnish dialects". Resubmitted revision in Language.

Santaharju, Jenni; Kaj Syrjänen; Terhi Honkola; Perttu Seppä; Outi Vesakoski and Unni-Päivä Leino. 2025. “Data release: Digitized Dialect Atlas of Finnish by Lauri Kettunen.” Digital Humanities in the Nordic and Baltic Countries Publication. DOI: 10.5617/dhnbpub.12270

Syrjänen, Kaj; Terhi Honkola; Jyri Lehtinen; Antti Leino and Outi Vesakoski. 2016. "Applying population genetic approaches within languages: Finnish dialects as linguistic populations." Language Dynamics and Change 6.235-83. DOI: https://doi.org/10.1163/22105832-00602002.

Roose, Meeli; Tua Nylén; Petro Pesonen; Harri Tolvanen and Outi Vesakoski. 2025. “Uralic Historical Atlas (URHIA): Interactive Web App for Spatial Data”. Digital Humanities in the Nordic and Baltic Countries Publications 7 (3). DOI: https://doi.org/10.5617/dhnbpub.12261.

Vesakoski, Outi; Michael Dunn; Meeli Roose and Jenni Santaharju. 2025. “The Uralic Trove (UraLaari) – The Digital Data Infrastructure of Speaker Areas of Uralic Languages and Finnish Dialects”. Digital Humanities in the Nordic and Baltic Countries Publications 7 (3). DOI: 10.5617/dhnbpub.12266

Files

CEDEDA_v0_1_08052025.zip

Files (767.0 kB)

md5:c7b04d9ea1733064b6ed3780ffce6dbf
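The MD5 checksum listed for the archive can be used to check that a download is complete. The following is a minimal sketch using Python's hashlib; the function names are illustrative, and the local file path passed in is an assumption that depends on where the archive was saved.

```python
import hashlib

# Checksum as listed for CEDEDA_v0_1_08052025.zip in this record.
EXPECTED_MD5 = "c7b04d9ea1733064b6ed3780ffce6dbf"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected MD5 digest."""
    return md5_of_file(path) == expected.lower()
```

For example, archive_is_intact("CEDEDA_v0_1_08052025.zip") would return True for an unmodified download placed in the current working directory.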
