Published February 24, 2021 | Version v1
Presentation Open

Outcomes of the fifth DELAD Workshop held on 27-28 January 2021

  • 1. CLS/CLST, Radboud University, the Netherlands
  • 2. University College Cork, Ireland
  • 3. University of Groningen
  • 4. Adam Mickiewicz university in Poznan, Poland
  • 5. University of Helsinki, Finland
  • 1. University of Zagreb, Croatia
  • 2. Siena University, Italy
  • 3. Warsaw University, Poland
  • 4. FNRS & IRSTL, UMons, Belgium
  • 5. The Language Archive, MPI, the Netherlands
  • 6. Nederlands Kankerinstituut, the Netherlands
  • 7. CLS/CLST, Radboud University, the Netherlands
  • 8. University of Groningen

Description

What was the workshop about?

This workshop was the fifth of a series that started in 2015, and it was the third organized under the CLARIN umbrella. About 30 participants registered and attended the meeting. They came from all over Europe, including the Netherlands, Finland, Ireland, Poland, Italy, France, Estonia and the UK, with backgrounds in language and speech pathology, linguistics and phonetics, speech technology, data archiving, ICT, and law. This is exactly the mix that makes DELAD attractive and suited for discussing and sharing CSD.

The workshop was organized by the DELAD steering group together with Esther Hoorn from the CLARIN Legal and Ethical Issues Committee (CLIC).

Goal

The aim of this workshop was to:

  • Extend the DELAD network with new participants;
  • Explore with the participants the potential of the new CLARIN K-Centre on Atypical Communication Expertise (ACE) for hosting CSD of DELAD members;
  • Exchange deeper insights on Data Protection Impact Assessments (DPIAs); 
  • Discuss voice conversion as a means to pseudonymize speech.

An overview of the workshop can be found here

Day 1

On the first day, the workshop started with four presentations:

  • “Croatian written and spoken corpora of speech with communication disorders” (Gordana Hržica)
  • “Oral and written documents of mental health patients” (Silvia Calamai & Rosalba Nodari)
  • “Using electromagnetic articulography for the purpose of studying speaking styles and speech disorders” (Katarzyna Klessa, Anita Lorenc, & Łukasz Mik)
  • “Parkinson’s disease: A French corpus collected using MonPaGe protocol” (Veronique Delvaux)

Apart from discussing the research itself, the presenters considered how the resulting CSD could be shared. This applied especially to Calamai & Nodari and Delvaux, who were still looking for a suitable home for their corpora, where DELAD could play a role.

The afternoon session was devoted to exactly that topic: how can DELAD, in cooperation with the CLARIN Knowledge Centre for Atypical Communication Expertise, assist in realizing GDPR-compliant access to such CSD?

  • “Help from DELAD and CLARIN Centre for Atypical Communication Expertise (ACE) in sharing CSD” (Henk van den Heuvel)
  • “How to access & deposit existing data at CLARIN centres, profiles of metadata, licenses” (Paul Trilsbeek)

Day 2

In the morning session of the second day, a role play devised by Esther Hoorn and her team was scheduled. The role play led to lively discussion of the various elements that need to be taken into account when documenting your considerations in a Data Protection Impact Assessment (DPIA). Such a game is an entertaining way to touch upon aspects that are relevant when sharing your CSD, and it led to several eye-openers in the discussions! Here, the breakout rooms in Zoom served well to split the group into two.

In the afternoon, Rob van Son gave a keynote talk on “Use voice conversion for pseudonymisation?”. The intriguing idea behind the method he presented is, on the one hand, to retain the linguistic and paralinguistic features of the speech and, on the other hand, to remove the identity of the speaker. If that could be done successfully, the speech could count as anonymised and would not be subject to the GDPR. Rob van Son presented results from the Voice Privacy Challenge 2020 and concluded that speaker-identifying information can be removed from speech, but also noted that all systems had issues with naturalness and intelligibility. A relevant issue for DELAD, of course, is whether pathological speech, e.g. dysarthric speech, can still be studied after pseudonymisation. Further evidence is needed to answer this question. During the workshop, several participants expressed interest in a case study for part of their material.

Lessons learnt, points taken

It was great to have the support of CLARIN staff in organizing an online workshop like this via Zoom. CLARIN took care of the Zoom addresses and mailings, and of the breakout rooms for the coffee breaks and the role play. This relieved us as workshop organizers of a serious organizational burden, so that we could concentrate on the content of the workshop.

We were satisfied with the long lunch breaks of two hours. They gave participants the opportunity to digest the content of the morning session and to do something (relaxing) in between. We also stopped quite early (at 16:00) to avoid our participants becoming “Zoombies” at the end of the day. This experience led us to think about how to take advantage of online meetings in the future, in addition to face-to-face meetings.

We were happy with the new researchers who registered for the workshop. Their presentations were very interesting and diverse. The workshop in general provided a good mix of research- and data-oriented presentations and presentations focusing on the legal and ICT support that DELAD can offer for sharing such data. The role play on the Data Protection Impact Assessment (DPIA) was a highly valued interactive aspect of the program.

Action points

As relevant action points for our DELAD network we have identified:

  • Set up a number of case studies on pseudonymization of (multilingual) pathological speech data and organize a workshop around this; 
  • Look into ways to promote DELAD and its benefits for sharing CSD on national levels (mentioned were a slide deck for promotion, a social media campaign, and folders);
  • Partners are interested in curating and sharing their corpora via DELAD;
  • This will yield further case studies on sharing CSD. DPIAs should be integrated;
  • Share reference DPIAs with the community via the DELAD website;
  • Share consent form templates via the DELAD website, including how to structure them, relevant aspects (must-haves), and checklists with examples;
  • Topics for next workshops:
    • Progress on pseudonymization 
    • Sharing data via remote secure access option
    • Experiences on sharing datasets (make it more concrete)

Notes

If you are interested in becoming a member of DELAD you can subscribe here. Contact for the workshop: Henk van den Heuvel (h.vandenheuvel@let.ru.nl). This event was supported by CLARIN and by the SSHOC Project (Grant Agreement 823782 under H2020).

Files

00-DELAD Workshop 5_Day 1_Intro_HvdH.mp4

Files (2.5 GB)

Checksum Size
md5:d5d5c74d9f320a2fb0e2a3a0e4555a54 26.6 MB
md5:0cc0f0570cb23435373ea997e2bc0e31 514.8 kB
md5:bac566b9b044f57b0bd72ed03642d181 110.9 MB
md5:06b397f00d6c2c8e41b28f5edd7bca08 357.9 kB
md5:f5f072522928a33717003da411e0402c 96.1 MB
md5:9d7862b0ca63d2ca4d2b525fff9c1403 934.1 kB
md5:5ac44c4196214186f4478244b0c98262 1.2 GB
md5:f939e160b3683d84f6d01438bb72914f 4.2 MB
md5:d3010950e95487b355d235f48b320aab 102.9 MB
md5:c1e15a0fda1ddfe5bde11481b52ffdad 676.7 kB
md5:3e5c9925e65d5a91cf4db83648962375 184.0 MB
md5:cacd5ba51a26a698eeedd0d2ff87b8c7 1.4 MB
md5:c472759c9b64944c9281b710a2a4fbd5 428.3 MB
md5:2c986bb8f8f4e24df3d97ddfa1d97924 1.1 MB
md5:0f661c89be82542de2b88f74871838f3 134.9 MB
md5:8a04d8c1e2f5338121151c0a55a19a8f 1.0 MB
md5:e70997f3e3913c7cef4334b11c9a728b 211.5 MB
md5:013913c886f7bc6385cd4116cf9f7210 4.4 MB


Additional details

Funding

SSHOC – Social Sciences & Humanities Open Cloud 823782
European Commission