Ladder. A Corpus of Computer-Mediated Communication for the Analysis of the Acquisition of Pragmalinguistic Competences by German-Speaking Learners of Italian.
Many recent research projects (Artoni, Benigni, & Nuzzo, 2020; Cortés Velásquez & Nuzzo, 2017; Nuzzo & Cortés Velásquez, 2020) have underlined the usefulness of creating and analyzing corpora for teaching pragmatics, which, unlike other linguistic levels such as syntax, cannot be explained by rules but only by reference to tendencies and to choices that are more or less appropriate in a given context. This holds all the more for interactions via digital media, such as email and instant-messaging services, which receive little attention in textbooks or L2 courses and for which learners have few reference models (Brocca, 2021; Trubnikova & Garofolin, 2020).
Data were collected from April 2020 to April 2021 by means of a discourse completion task (DCT). The data consist of emails and instant messages. The informants are (i) German-speaking learners of Italian at CEFR levels A2 to C1, most of whom are students living in Tyrol (Austria), and (ii) native speakers of Italian, most of whom are students from Rome (Italy). The learner data were collected by students of the undergraduate seminar “Insegnare la pragmatica”, which is part of the compulsory module 2b for student teachers at the Institute of Didactics of the University of Innsbruck. The native-speaker data were collected largely from foreign-language students at Roma Tre University, thanks to the collaboration of Prof. Elena Nuzzo.
The DCTs were administered via online questionnaires. Along with the texts, metadata providing sociolinguistic information about each informant (age, self-assessed language level, place of residence, native language, etc.) were recorded through an online questionnaire. The DCTs aim to elicit the speech acts of request and refusal across increasing levels of social distance and in different media (Taguchi & Roever, 2017, pp. 85, 231; Hinger et al., 2018, p. 148). They vary the degree of formality (study/work or free time), the addressee (lecturer, friend, boss) and the medium (email or instant messaging). The scenarios represent authentic circumstances for the students. The situations studied are as follows:
Email: high level of social distance between sender and recipient
Scenario 1: Sender is asking for something that he/she is not entitled to
Scenario 2: Sender is asking for something that he/she is entitled to

Instant messaging:
a) low level of social distance between sender and recipient
Scenario 1: Request
Scenario 2: Rejecting a request
Scenario 3: Short-notice cancellation of an invitation
b) medium level of social distance between sender and recipient
Scenario 4: Request
Scenario 5: Rejecting a request
Scenario 6: Short-notice cancellation of an invitation
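The design of the scenarios can also be written down as a small data structure, which may be convenient when filtering the corpus by medium, social distance or speech act. The following Python sketch is only an illustration: the class name `Scenario`, its field names and the identifiers such as `mail_a` or `im_1` are assumptions and are not taken from the corpus files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    scenario_id: str      # hypothetical identifier, not a column of the corpus
    medium: str           # "email" or "instant messaging"
    social_distance: str  # "low", "medium" or "high"
    speech_act: str       # "request", "refusal" or "cancellation"
    recipient: str        # addressee named in the prompt


# The eight situations of the DCT, following the overview of scenarios.
SCENARIOS = [
    Scenario("mail_a", "email", "high", "request", "lecturer"),
    Scenario("mail_b", "email", "high", "request", "lecturer"),
    Scenario("im_1", "instant messaging", "low", "request", "classmates"),
    Scenario("im_2", "instant messaging", "low", "refusal", "friend"),
    Scenario("im_3", "instant messaging", "low", "cancellation", "friend"),
    Scenario("im_4", "instant messaging", "medium", "request", "supervisor"),
    Scenario("im_5", "instant messaging", "medium", "refusal", "supervisor"),
    Scenario("im_6", "instant messaging", "medium", "cancellation", "supervisor"),
]


def select(scenarios, **criteria):
    """Return the scenarios whose fields match all given criteria."""
    return [
        s for s in scenarios
        if all(getattr(s, field) == value for field, value in criteria.items())
    ]


# Example: all request scenarios, regardless of medium.
requests = select(SCENARIOS, speech_act="request")
```

Treating short-notice cancellations as a label of their own is a choice made for this sketch; they could equally be grouped with refusals.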
The WhatsApp messages, which exemplify the text type instant messaging, were written directly on a cell phone. The metadata were subsequently linked to the respective messages in an Excel spreadsheet. All personal data were anonymized.
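Linking metadata to messages amounts to a join on an informant identifier. The sketch below shows one possible way to do this in Python with the standard library. All column names (`informant_id`, `scenario`, `text`, `group`, `self_assessed_level`) and all values are invented placeholders; the actual column layout of the corpus files is not described here and may differ.

```python
import csv
import io

# Placeholder excerpts; neither the columns nor the values come from the corpus.
MESSAGES_CSV = """informant_id,scenario,text
L001,1,placeholder request text
L001,2,placeholder refusal text
N001,1,placeholder request text
"""

METADATA_CSV = """informant_id,group,self_assessed_level
L001,learner,B1
N001,native,
"""


def attach_metadata(messages_csv, metadata_csv):
    """Add the informant's metadata to every message row."""
    metadata = {
        row["informant_id"]: row
        for row in csv.DictReader(io.StringIO(metadata_csv))
    }
    merged = []
    for message in csv.DictReader(io.StringIO(messages_csv)):
        # Messages without a metadata entry are kept, just without extra fields.
        info = metadata.get(message["informant_id"], {})
        merged.append({**info, **message})
    return merged


rows = attach_metadata(MESSAGES_CSV, METADATA_CSV)
```

Each element of `rows` then carries both the message and the sociolinguistic information of its author, so that sub-corpora (for example, by language level) can be selected with a simple filter.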
The prompts were presented in Italian, as follows:
Mail a) Immagina di star facendo un corso con il Dr. Nicola Brocca. Domani devi fare una presentazione in classe. Non hai avuto tempo per studiare perché dovevi prepararti a un esame di inglese e ti accorgi che il materiale da presentare è più di quello che avevi previsto. Scrivi una mail al professore: la tua speranza è spostare la presentazione.
Engl: Imagine you are taking a course with Dr. Nicola Brocca. Tomorrow you have to give a presentation in class. You had no time to study because you had to prepare for an English exam, and you realize that there is more material to present than you had expected. Write an email to the professor: your hope is to reschedule the presentation.
Mail b) Hai fatto un corso con il Dr. Brocca. Hai consegnato il tuo portfolio il 01.02.2020 adesso è il 01.03.2020 e non hai ancora ricevuto il voto. Ti serve il voto per registrarti per una borsa di studio. Manda una mail al prof.: il tuo obiettivo è ricevere il voto al più presto
Engl: You have taken a course with Dr. Brocca. You turned in your portfolio on 1 February 2020; it is now 1 March 2020 and you have not received the grade yet. You need the grade to register for a scholarship. Send an email to the professor: your goal is to receive the grade as soon as possible.
1. Sei in Erasmus in Italia. Avete creato una chat con 10 compagni di corso. Hai perso la tua tessera della biblioteca a vuoi chiedere se qualcuno ti può aiutare perché ti serve un libro entro domani...per esempio prestandoti la sua. Cosa scrivi?
Engl: You are taking part in the Erasmus program in Italy. You have created a chat with 10 classmates. You have lost your library card and want to ask if someone can help you because you need a book by tomorrow, for example by lending you their card. What do you write?
2. Ricevi questo messaggio da un amico/a che fa un seminario con te: "Ciao, sono a corto di tempo. Ho visto che hai preso 30 all'esame. Potresti darmi una mano e restare con me in biblioteca oggi?" Non vuoi aiutare il tuo amico. Come reagisci?
Engl: You receive this message from a friend who is attending a seminar with you: "Hello, I'm running out of time. I saw that you got a 30 on the exam. Could you help me and stay with me in the library today?" You don't want to help the friend. How do you respond?
3. Cinque giorni fa hai promesso ad un/a amico/a che questa sera sareste andati al cinema assieme. Però hai cambiato idea. Cosa fai? Cosa scrivi?
Engl: Five days ago, you promised a friend that tonight you would go to the movies together. But you have changed your mind. What do you do? What do you write?
4. Sei al lavoro e hai smarrito il documento elettronico per entrare nel parcheggio. Sei nuovo in questo gruppo di lavoro e hai solo il numero del tuo diretto superiore. Gli mandi un messaggio per chiedergli se ti può aiutare.
Engl: You are at work and have lost your electronic badge to enter the parking lot. You are new to this work group and only have the number of your direct supervisor. You send him/her a message and ask if he/she can help you.
5. Ricevi questo messaggio dal/la tuo/a superiore. "Gentile collega, domani c'è una scadenza importante. Per caso sarebbe in grado di restare oggi in ufficio oltre l'orario?" Non vuoi restare in ufficio oltre il normale. Come reagisci?
Engl: You receive this message from your supervisor. "Dear colleague, there is an important deadline tomorrow. Would you by any chance be able to stay in the office after hours today?" You don't want to stay in the office beyond normal working hours. How do you respond?
6. Cinque giorni fa hai promesso al/la tuo/a superiore che oggi saresti andato a una cena di lavoro. Però devi disdire. Cosa fai?
Engl: Five days ago, you promised your supervisor that you would go to a business dinner today. However, you have to cancel. What do you do?
The corpus, which was first collected in .xlsx format, was exported to XML and CSV formats in cooperation with Joseph Wang-Kathrein (Brenner Archive Research Center). Care was taken to ensure that emoticons and special characters were transferred unchanged during conversion. These formats allow long-term archiving and make data exchange considerably easier.
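Users who convert or re-save the files themselves may want to check that emoticons and accented characters survive the process. The sketch below illustrates such a check in Python by writing a row to a UTF-8 CSV file and reading it back. It is not the conversion procedure used for the corpus, which is not documented here; the column names `message_id` and `text` and the sample row are invented for the example.

```python
import csv
import os
import tempfile


def roundtrip(rows, fieldnames):
    """Write rows to a temporary UTF-8 CSV file and read them back."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        # newline="" lets the csv module handle line endings itself.
        with open(path, "w", encoding="utf-8", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
        with open(path, "r", encoding="utf-8", newline="") as handle:
            return list(csv.DictReader(handle))
    finally:
        os.remove(path)


# Invented sample row containing an emoji and accented characters.
sample = [{"message_id": "1", "text": "Grazie mille 😊 perché no?"}]
restored = roundtrip(sample, ["message_id", "text"])
unchanged = restored == sample
```

Passing `encoding="utf-8"` explicitly matters because the default encoding of `open` depends on the platform, and an implicit legacy encoding is a common cause of damaged emoji.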
The size of the corpus (as of May 2021, version Ladder 1.0):
The LADDER corpus includes emails and instant messages amounting to 18,935 tokens and 33,966 tokens respectively. The corpus of WhatsApp messages consists of a total of 1,204 messages from 80 native speakers and 114 learners. The corpus of emails consists of a total of 235 emails from 78 native speakers and 38 learners. The amount of data allows qualitatively relevant comparisons between sub-corpora, e.g. by language level.
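A few derived figures follow directly from these counts. The short Python sketch below computes them; the dictionary keys and the function name `summarize` are chosen for this illustration only, and the informant counts of the two sub-corpora are deliberately not added together because it is not stated whether the same people contributed to both.

```python
# Counts as reported for version Ladder 1.0 (May 2021).
EMAIL = {"tokens": 18_935, "texts": 235, "native": 78, "learners": 38}
IM = {"tokens": 33_966, "texts": 1_204, "native": 80, "learners": 114}


def summarize(sub_corpus):
    """Derive informant count and simple averages for one sub-corpus."""
    informants = sub_corpus["native"] + sub_corpus["learners"]
    return {
        "informants": informants,
        "tokens_per_text": round(sub_corpus["tokens"] / sub_corpus["texts"], 2),
        "texts_per_informant": round(sub_corpus["texts"] / informants, 2),
    }


total_tokens = EMAIL["tokens"] + IM["tokens"]  # 52,901 tokens overall
email_summary = summarize(EMAIL)
im_summary = summarize(IM)
```

On these figures an email averages roughly 81 tokens and an instant message roughly 28 tokens.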
The size of the corpus is necessarily limited, as data collection must be done manually through the individual administration of each DCT and the checking of metadata. The major bottleneck is currently the annotation of socio-pragmatic aspects, a process that is difficult to automate and that needs to be conducted through cross-annotation by multiple annotators.
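When two annotators label the same texts, their agreement is often summarized with Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The Python sketch below is a generic illustration of that measure. Whether kappa or another coefficient is used for LADDER is not stated here, and the labels `direct` and `indirect` are hypothetical categories, not the annotation scheme of the corpus.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty item list.")
    n = len(labels_a)
    # Proportion of items on which the annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from each annotator's label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels assigned by two annotators to four messages.
annotator_1 = ["direct", "indirect", "direct", "indirect"]
annotator_2 = ["direct", "indirect", "indirect", "indirect"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

In this toy example the annotators agree on three of four items (0.75) while chance agreement is 0.5, which gives a kappa of 0.5.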
Some student work based on the corpus has been collected and is accessible via the following link: https://ladder.hypotheses.org/
Artoni, D., Benigni, V., & Nuzzo, E. (2020), "Pragmatic instruction in L2-Russian: a study on requests and advice" in Instructed Second Language Acquisition, 4(1), 62-95. doi:10.1558/isla.39864
Brocca, N. (2021), "LADDER: La costruzione e analisi di un corpus di scritture digitali per l’insegnamento della pragmatica in L2" in Italiano Lingua Due, 13(1).
Cortés Velásquez, D., & Nuzzo, E. (2017), "Disdire un appuntamento: spunti per la didattica dell'italiano L2 a partire da un corpus di parlanti nativi" in Italiano Lingua Due, 1, 17-36.
Hinger, B., Stadler, W., Schmiderer, K., & Bauer, M. (Eds.) (2018), Testen und Bewerten fremdsprachlicher Kompetenzen. Tübingen: Narr Francke Attempto Verlag.
Nuzzo, E., & Cortés Velásquez, D. (2020), "Canceling Last Minute in Italian and Colombian Spanish: A Cross-Cultural Account of Pragmalinguistic Strategies" in Corpus Pragmatics, 4, 1-26. doi:10.1007/s41701-020-00084-y
Taguchi, N., & Roever, C. (2017), Second language pragmatics. Oxford: Oxford University Press.
Trubnikova, V., & Garofolin, B. (2020), Lingua e interazione. Insegnare la pragmatica a scuola. Pisa: ETS.
Instant Messaging Corpus.csv