The Manzini & Savoia (2005) Corpus: Morphosyntactic Variation in Italian and Romansh Dialects
Creators
- Savoia, Leonardo Maria (Project manager)1
- Manzini, Maria Rita (Data manager)1
- Mazzaggio, Greta (Project manager)1
- Luca Andrea, Ludovico (Data curator)2
- Vena, Mael Vittorio (Researcher)2
- Zoli, Carlo (Data curator)3
- Baldi, Benedetta (Project member)
- Franco, Ludovico (Project member)1
- Binazzi, Neri (Project leader)1
- 1. Università degli Studi di Firenze
- 2. Università degli Studi di Milano
- 3. Libera Università di Bolzano
Description
Corresponding Author
Mazzaggio, Greta (greta.mazzaggio@unifi.it)
Abstract
This dataset consists of linguistic examples of Italian morphosyntactic microvariation documented in the three volumes of Manzini M.R., Savoia L.M. (2005), I dialetti italiani e romanci. Morfosintassi generativa, Alessandria, Edizioni dell’Orso.
These data are also accessible from the following link: https://manzinisavoia.changes.unifi.it/
Dataset content
The dataset is a corpus of linguistic examples illustrating microvariation in Italian dialects, compiled by Manzini and Savoia (2005). It includes data from 457 Italian dialectal varieties, 9 Corsican varieties, and 19 Swiss varieties, all collected through field research and transcribed in the International Phonetic Alphabet (IPA). The corpus contains a total of 64,472 linguistic examples, each consisting of a dialectal sentence in IPA along with its Italian gloss. For each linguistic example, the CSV file provides the following fields, named in its header line:
- Locality_official_name: Official name of the municipality or village.
- Locality_short: Shortened version of the municipality or village name.
- Chapter_title: Title of the chapter in Manzini & Savoia (2005) where the example appears.
- Chapter_subtitle: Title of the subchapter in Manzini & Savoia (2005) where the example appears (if applicable).
- Example: Dialectal sentence transcribed in IPA.
- Glossa: Italian gloss of the example.
Data are in CSV format. For further information about data acquisition and digitization, please refer to the publication below. To view the IPA examples correctly, use a text editor or software that supports UTF-8 encoding.
The dataset consists of a single file of approximately 3.5 MB.
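As a minimal sketch of how the file might be read, the Python snippet below parses the CSV as UTF-8 text and counts examples per locality. The field names come from the list above; the comma delimiter, the function names, and the placeholder sample rows are assumptions made for illustration and are not taken from the corpus.

```python
import csv
import io
from collections import Counter

# Column names as listed in the dataset description.
FIELDS = [
    "Locality_official_name",
    "Locality_short",
    "Chapter_title",
    "Chapter_subtitle",
    "Example",
    "Glossa",
]


def read_examples(stream, delimiter=","):
    """Parse an open text stream into a list of row dictionaries.

    The comma delimiter is an assumption; pass a different one if needed.
    """
    reader = csv.DictReader(stream, delimiter=delimiter)
    missing = [name for name in FIELDS if name not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing expected columns: {missing}")
    return list(reader)


def read_examples_from_file(path, delimiter=","):
    """Open the CSV as UTF-8 so that IPA characters are decoded correctly."""
    with open(path, encoding="utf-8", newline="") as handle:
        return read_examples(handle, delimiter)


def examples_per_locality(rows):
    """Count how many examples each locality contributes."""
    return Counter(row["Locality_official_name"] for row in rows)


# Placeholder rows (NOT real corpus data), used only to exercise the functions.
SAMPLE = (
    "Locality_official_name,Locality_short,Chapter_title,Chapter_subtitle,Example,Glossa\n"
    "Place A,A,Chapter X,,example one,gloss one\n"
    "Place A,A,Chapter X,Section Y,example two,gloss two\n"
    "Place B,B,Chapter Z,,example three,gloss three\n"
)

rows = read_examples(io.StringIO(SAMPLE))
counts = examples_per_locality(rows)
print(counts.most_common())
```

With the downloaded file, `read_examples_from_file("M-S_2005_2024-01-31_v1.0.csv")` would replace the in-memory sample, provided the delimiter assumption holds.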
Terms of use
This work has been supported by funding from the Italian Ministero dell’Università e della Ricerca and from the European Union (PNRR - PE05 CHANGES CUP B53C22004010006).
The dataset is open access for scientific research and non-commercial purposes.
The authors ask users to acknowledge their work and, in scientific publications, to cite the following works:
- Manzini, M. R., & Savoia, L. M. (2005). I dialetti italiani e romanci. Morfosintassi generativa. Alessandria, Edizioni dell’Orso.
- Mazzaggio, G., & Binazzi, N. (2024). Valorizzare il patrimonio immateriale: un’esperienza di digitalizzazione del dialetto. DILEF. Rivista digitale del Dipartimento di Lettere e Filosofia, 3, pp. 224-242. https://doi.org/10.35948/DILEF/2024.4348
- Mazzaggio, G., Ludovico, L. A., Vena, M. V., Manzini, M. R., & Savoia, L. M. (2023). Morphosyntax of Italian and Romance Varieties: Presentation of the Manzini and Savoia (2005) Corpus and Its Digitalization. Bollettino dell’Atlante Linguistico Italiano, 2023(47), 185-210.
Files
Name | Size |
---|---|
M-S_2005_2024-01-31_v1.0.csv (md5:e6c1c1234b98ba2576ff0aea95ebd700) | 3.5 MB |
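The md5 checksum listed for the file can be used to check that a download is intact. The following is a small sketch using Python's standard `hashlib` module; the function names and the chunk size are illustrative choices, and the local file path is whatever location the file was saved to.

```python
import hashlib

# Checksum as published on this record for M-S_2005_2024-01-31_v1.0.csv.
EXPECTED_MD5 = "e6c1c1234b98ba2576ff0aea95ebd700"


def md5_of_file(path, chunk_size=65536):
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's md5 digest equals the expected value."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_expected("M-S_2005_2024-01-31_v1.0.csv")` on the downloaded file should return `True` if the file is unmodified.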
Additional details
Related works
- Is described by
- Publication: 10.35948/DILEF/2024.4348 (DOI)