Sound Comparisons: Germanic
Authors/Creators
- 1. Leibniz-Zentrum Allgemeine Sprachwissenschaft, Berlin
- 2. DLCE, Max Planck Institute for the Science of Human History, Jena, Germany
- 3. Linguistics and English Language, University of Edinburgh
- 4. Technology Management, FAU Erlangen-Nuremberg
- 5. Dept of German, Nordic and Slavic, University of Wisconsin-Madison
Description
SOUND COMPARISONS: GERMANIC
Sound Comparisons: Germanic is a free online resource for exploring the diversity in pronunciations across the Germanic language family.
At https://soundcomparisons.com/Germanic, just tap, click or hover the mouse over the map to hear and compare instantaneously how the ‘same’ basic Germanic words are pronounced, similarly or differently, in the many regional dialects and languages within Germanic (a total of 13,075 word-recordings). The voices are those of native speakers of Germanic languages, as spoken in c. 120 different locations throughout the Germanic-speaking world.
The geographical focus is on those parts of Europe where Germanic languages are spoken (including of course Germany, the Netherlands, Britain and most of Scandinavia), and especially on those regions where rich dialectal diversity still survives — and indeed may urgently need to be recorded for posterity, where it may soon go extinct. Coverage also extends to where Germanic languages have spread further afield, in Europe and far beyond, especially where those varieties differ significantly in pronunciation from standard pronunciations in Europe: e.g. Transylvanian ‘Saxon’, Pennsylvania ‘Dutch’, Pomeranian in Wisconsin, Afrikaans, and varieties of English worldwide.
Sound Comparisons: Germanic focuses on 106 words selected as ideal Germanic cognates, i.e. words that stem from the same origin in all Germanic languages, but can differ significantly in pronunciation from region to region. The word daughter, for example, is pronounced (shown here using phonetic transcription) as [ˈtʰɒχtɐ], [ˈdɔχtəɹ], [ˈdɒχtər̥], [ˈdʊɪʃtɐ], [ˈdɛ̝ɾʌ̈] and [ˈdɔʰtɪɾ̥] in the Sound Comparisons recordings of standard German, standard Dutch, West Frisian, Luxembourgish, Danish and Icelandic respectively — as just a few examples. Or across various accents of English, daughter is [ˈdɔ̞xtɐ̠ɾ̥] in traditional ‘Doric’ Scots, [ˈdɑˑɔɾɹ̩] in South Carolina and [ˈdɔːtʔ͡ɐ] in Newcastle upon Tyne.
The https://soundcomparisons.com/Germanic website is a user-friendly resource for speakers of the Germanic languages to explore how the different branches of the Germanic family relate to each other, which also helps explain their origins and history. (Germanic can also be compared with words ‘cognate’ even more deeply, across hundreds of languages and dialects in the Romance, Celtic and Slavic families, at https://soundcomparisons.com/Europe.) The website is also intended for both teachers and learners of Germanic languages, and can be viewed in several different user languages.
For linguists, Sound Comparisons is a highly customisable research tool. It offers powerful, linguistically-informed search and filter functionality, and the ability to cite and download targeted sets of both sound files and phonetic transcriptions. For almost all of its 13,075 individual word recordings, Sound Comparisons: Germanic also provides a close phonetic transcription in the International Phonetic Alphabet.
For more details on Sound Comparisons: Germanic, and the wider Sound Comparisons framework, see:
- The https://soundcomparisons.com/Germanic website, including the ‘how to cite’ and contributors pages.
- The overall https://soundcomparisons.com site, including the history of the project and its funding.
- The first, brief Sound Comparisons launch publication:
Heggarty, Paul, Aviva Shimelman, Giovanni Abete, Cormac Anderson, Scott Sadowsky, Ludger Paschen, Warren Maguire, Lechoslaw Jocz, María José Aninao, Laura Wägerle, Darja Dërmaku-Appelganz, Ariel Pheula do Couto e Silva, Lewis C. Lawyer, Jan Michalsky, Ana Suelly Arruda Câmara Cabral, Mary Walworth, Ezequiel Koile, Jakob Runge & Hans-Jörg Bibiko. 2019. Sound Comparisons: A new online database and resource for researching phonetic diversity. Proceedings of the 19th International Congress of Phonetic Sciences, p.280–4. Canberra, Australia: Australasian Speech Science and Technology Association.
https://icphs2019.org/icphs2019-fullpapers/pdf/full-paper_490.pdf
DATA FILES AND FORMATS
This Zenodo publication includes four data files: three data tables (all in .tsv format, UTF-8 encoded), and one .zip of sound files.
SndComp_Germanic_Languages.tsv
- A .tsv data-table of all 130 language varieties included in the Sound Comparisons: Germanic database.
- One record (table row) per language in the data‑set.
SndComp_Germanic_Words.tsv
- A .tsv data-table of all 106 words in the Sound Comparisons: Germanic database.
- One record (table row) per word in the data‑set.
- Strictly, ‘words’ here refers to cognates rather than to meanings. There are two different types of study within the Sound Comparisons framework, and Germanic is of the cognate-based type, not the meaning-based type. To ensure that comparisons of divergence in phonetics are valid, the ‘words’ here are cognates, i.e. derived from the same origin in the ‘Proto-Germanic’ common ancestor language. Some cognate words may over time shift and diverge in meaning in some language varieties, so these words do not necessarily mean exactly the same in all the different varieties of Germanic.
SndComp_Germanic_Transcriptions.tsv
- A .tsv data-table of all phonetic transcriptions in https://soundcomparisons.com/Germanic.
- One transcription record (table row) per word per language in the data‑set.
- Records also include word spellings for reference languages for which established orthographies do exist, although these are relatively few in comparison to the number of regional languages and dialects covered.
- Records also include a number of fields to cover cases where the word-form recorded in a given language variety is not straightforwardly equivalent on all levels to the word-forms in most other language varieties. It may for example be only partially cognate, because it has a different morphological structure; or the cognate word recorded may have a different meaning in this individual language variety; and so on.
SndComp_Germanic_SoundFiles.zip
- A .zip of all 13,075 sound files in https://soundcomparisons.com/Germanic, in .mp3 format. (Other formats, .ogg and .wav, are available on request by emailing Paul.Heggarty@gmail.com.)
- One (very short) .mp3 file per word per language in the data‑set.
- The zip file unzips as a folder structure within which each language variety has its own sub-folder. This sub-folder contains the c. 106 individual short sound files for that language variety.
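As a rough illustration of working with these files, the following Python sketch reads one of the .tsv tables and counts the .mp3 files in each language sub-folder of the unzipped archive. It is only a sketch under stated assumptions: the column names of the tables are not listed in this description, so the code does not rely on any; the function names read_tsv and count_sound_files are hypothetical; and it assumes the tables use plain tab separation without quoted fields.

```python
import csv
from collections import Counter
from pathlib import Path


def read_tsv(path):
    """Return the rows of a tab-separated UTF-8 table as a list of dicts.

    Assumption: fields are not quoted, so quote characters inside
    transcriptions or spellings are kept literally (csv.QUOTE_NONE).
    "utf-8-sig" reads UTF-8 with or without a byte-order mark.
    """
    with open(path, encoding="utf-8-sig", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return list(reader)


def count_sound_files(root):
    """Count .mp3 files per sub-folder below the unzipped root folder.

    Each language variety is described as having its own sub-folder,
    so the name of the folder holding a file is used as the key.
    """
    counts = Counter()
    for mp3 in Path(root).rglob("*.mp3"):
        counts[mp3.parent.name] += 1
    return counts
```

Comparing the per-folder counts with the expected figure of about 106 is one quick way to spot language varieties with missing words or with additional variant recordings.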
CORRESPONDENCES BETWEEN TRANSCRIPTION RECORDS AND SOUND FILES
By default, each language variety has (at least) one sound file for each word. Also by default, for each of those sound files, the Transcriptions table (when complete) has one corresponding record (row), which includes an entry in the phonetic transcription field.
In some contexts, however, the default one-to-one correspondence does not apply.
- Sound files, but no corresponding phonetic transcription record. This arises most often simply where the work to produce a phonetic transcription has not yet been completed. This is usually when a language variety has only recently been recorded and added to the database. In such cases, recordings are usually uploaded to make the sound file available as soon as possible, pending the further extensive work to produce phonetic transcriptions. On the website, such cases show as a blue ‘play’ triangle in place of the phonetic transcription. Version 1.0 of Sound Comparisons: Germanic includes full transcriptions for all but 4 of the 130 language varieties covered.
- Records with a phonetic transcription, but no corresponding sound files. This arises most often for earlier historical stages of languages, such as Shakespearean English, or Middle High German. For these ‘historical’ languages, transcriptions of the assumed original pronunciations are provided for each word where possible (as far as this can be worked out with any confidence), but obviously there are no corresponding recording files.
- Records with a spelling entry only, and no entry in the phonetic transcription field, and no corresponding sound file. This arises in cases where a language variety has been entered as a spelling reference language only, not uniquely associated to any particular geographical variety of the language. This is so that Sound Comparisons can illustrate an established orthography for a language, and indeed these are typically intended as overarching compromise spelling systems that work for many regional pronunciation variants (using different ‘reading rules’).
- More than one sound file and corresponding phonetic transcription for the same single word in the same language variety. This can arise in two different types of case:
  - In some language varieties, different variant pronunciations are possible for the same basic word. Where the native speaker of a language variety gave more than one variant, Sound Comparisons includes up to three of these. The corresponding sound files are distinguished by the addition of _pron2 or _pron3 to their filenames.
  - In some language varieties, the native speaker provided multiple different words, often because one is the true cognate with the basic word in most other language varieties, but the other variant is actually more common in the default meaning. The sound files are distinguished by the addition of _lex2 or _lex3 to their filenames.
- Records with a ‘no-data’ entry .. in the phonetic transcription field. In Sound Comparisons, two dots .. are entered in the phonetic transcription field to show that no recording could be made for this word in this language variety. (This explicit entry helps distinguish this from cases where a recording is pending, but has not yet been made or uploaded.) Missing data usually arise because the target native (cognate) word has simply been lost from that particular language variety. In some regional language varieties within Germanic, for example, original native words have been replaced by loanwords from the dominant national standard language. Otherwise, individual words may simply have become relatively rare, and may no longer be recalled by the last speakers in regions where a regional language is dying out and has not been widely used for decades.
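The cases listed above can be sketched in Python as follows. This is an illustrative sketch, not part of the Sound Comparisons tooling: the function names, the category labels and the way the inputs are passed in are all hypothetical. It also assumes that the _pron2, _pron3, _lex2 and _lex3 markers appear in the filename stem followed by either another underscore or the end of the stem, since the exact filename layout is not specified here.

```python
import re
from pathlib import PurePath

NO_DATA = ".."  # explicit 'no-data' entry in the phonetic transcription field

# Assumed marker shape: "_pron2", "_pron3", "_lex2" or "_lex3" in the stem.
_VARIANT = re.compile(r"_(pron|lex)([23])(?![0-9A-Za-z])")


def variant_markers(filename):
    """Return the pronunciation and lexical variant numbers of a sound file.

    Files without a marker are treated as variant 1 of each kind.
    """
    stem = PurePath(filename).stem
    markers = {"pron": 1, "lex": 1}
    for kind, number in _VARIANT.findall(stem):
        markers[kind] = int(number)
    return markers


def classify_record(transcription, spelling, has_sound_file):
    """Assign one record to one of the cases described in this section.

    The returned labels are hypothetical names chosen for this sketch.
    """
    transcription = (transcription or "").strip()
    spelling = (spelling or "").strip()
    if transcription == NO_DATA:
        return "no-data"                # word could not be recorded
    if transcription and has_sound_file:
        return "default"                # transcription plus sound file
    if has_sound_file:
        return "transcription pending"  # sound file only
    if transcription:
        return "transcription only"     # e.g. historical languages
    if spelling:
        return "spelling only"          # spelling reference language
    return "empty"
```

Checking the two-dot entry first keeps explicit missing data separate from records that merely lack a transcription so far.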
VERSION INFORMATION: 1.0
In this version 1.0 of Sound Comparisons: Germanic, phonetic transcriptions are generally provided for all 106 words in almost all recorded language varieties. As of this version 1.0, most of these are first draft phonetic transcriptions, to be revised for consistency and standardisation in later version releases. Transcriptions are not yet available, however, for 4 language varieties only recently recorded: Ukraine Yiddish, East Frisian, and the English of Nova Scotia and of Pittsburgh.
Additional data fields — on cognacy, on morphological structure and on meaning differences between some cognates — are provided so far only for some words and languages, to be completed in later version releases, along with the phonetic transcriptions still outstanding.
KEYWORDS
Germanic, phonetics, languages, dialects, English, German, Dutch, Afrikaans, Frisian, Flemish, Luxembourgish, Danish, Swedish, Norwegian, Icelandic, Faroese, comparative linguistics, historical linguistics, diversity, database, recordings.
FUNDING
For information on funding and institutional support for Sound Comparisons: Germanic, see:
https://soundcomparisons.com/#/about/About-Sound-Comparisons:--Funding
LICENSE: CC BY-NC-ND 4.0
Sound Comparisons content is licensed under a:
Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0)
https://creativecommons.org/licenses/by-nc-nd/4.0
CONTRIBUTIONS AND HOW TO CITE
Paschen, Ludger, Paul Heggarty, Warren Maguire, Jan Michalsky, Darja Dërmaku-Appelganz & Matthew Boutilier. 2019. Sound Comparisons: Germanic.
https://soundcomparisons.com/Germanic
http://doi.org/10.5281/zenodo.3596072
For details on the contributions of individual authors, other contributors and further citation information, see:
Additional details
Related works
- Is part of
- 10.5281/zenodo.3595670 (DOI)