Published April 29, 2023 | Version v1
Dataset Open

Documentary sources of case studies on the issues a data protection officer faces on a daily basis

  • 1. University of Trento
  • 2. University of Trento, Vrije Universiteit Amsterdam

Description

The dataset contains the text of the documents that are sources of evidence used in [1] and [2] to distill our reference scenarios according to the methodology suggested by Yin in [3].

The dataset comprises 95 unique document texts spanning the period 2005-2022. It provides a corpus of documentary sources useful for outlining case studies on the scenarios in which a data protection officer (DPO) operates in the course of daily activities.

The language used in the corpus is mainly Italian, but some documents are in English and French. For the reader's benefit, we provide an English translation of the title of each document.

The documentary sources are of many types (for example, court decisions, supervisory authorities' decisions, job advertisements, and newspaper articles), issued by different bodies (such as supervisory authorities, data controllers, European Union institutions, private companies, courts, public authorities, research organizations, newspapers, and public administrations), and drafted by people in distinct professional roles (for example, data protection officers, general managers, university rectors, collegiate bodies, judges, and journalists).

The documentary sources were collected from 31 different bodies. Most of the documents in the corpus (a total of 83 documents) have been transformed into Rich Text Format (RTF), while the other documents (a total of 12) are in PDF format. All the documents have been manually read and verified.
The dataset is helpful as a starting point for a case study analysis of the daily issues a data protection officer faces. Details on the methodology can be found in the accompanying papers.
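The format counts stated above (83 RTF files and 12 PDF files) can be checked against a local copy of the corpus. The following is a minimal sketch, assuming the archive has already been extracted to a local directory; the function names and the `EXPECTED_COUNTS` constant are illustrative and not part of the dataset.

```python
from collections import Counter
from pathlib import Path

# Counts taken from the dataset description; the directory layout is an assumption.
EXPECTED_COUNTS = {".rtf": 83, ".pdf": 12}


def count_by_extension(directory):
    """Count regular files under `directory`, grouped by lowercase extension."""
    return Counter(
        path.suffix.lower()
        for path in Path(directory).rglob("*")
        if path.is_file()
    )


def counts_match_description(directory, expected=EXPECTED_COUNTS):
    """Return True when every expected extension has exactly the stated count."""
    counts = count_by_extension(directory)
    return all(counts.get(ext, 0) == number for ext, number in expected.items())
```

Calling `counts_match_description` with the path of the extracted directory returns `False` if files are missing or were extracted incompletely.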

The available files are as follows:

  • documents-texts.zip --> Contains a directory of .rtf files (in some cases .pdf files) with the text of the documents used as sources for the case studies. Each file has been renamed with its SHA1 hash so that it can be easily identified.
  • documents-metadata.csv --> A CSV file with the metadata for each document used as a source for the case studies.
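Because each document file is named after its SHA1 hash, the file names can double as an integrity check. The sketch below assumes that the name (without extension) is the hexadecimal SHA-1 digest of the file's own bytes; the description does not state exactly what was hashed, so a mismatch may only mean that this assumption is wrong. The function names are illustrative.

```python
import hashlib
from pathlib import Path


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal SHA-1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_name_mismatches(directory):
    """List .rtf/.pdf files whose name does not equal their SHA-1 digest.

    Assumption: the file stem is the SHA-1 of the file contents.
    """
    mismatches = []
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file() and path.suffix.lower() in {".rtf", ".pdf"}:
            if path.stem.lower() != sha1_of_file(path):
                mismatches.append(path)
    return mismatches
```

An empty list from `find_name_mismatches` means every file name is consistent with its contents under the stated assumption.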

This dataset is the original one used in the publication [1] and the preprint containing the additional material [2].

[1] F. Ciclosi and F. Massacci, "The Data Protection Officer: A Ubiquitous Role That No One Really Knows" in IEEE Security & Privacy, vol. 21, no. 01, pp. 66-77, 2023, doi: 10.1109/MSEC.2022.3222115, url: https://doi.ieeecomputersociety.org/10.1109/MSEC.2022.3222115.

[2] F. Ciclosi and F. Massacci, "The Data Protection Officer, an ubiquitous role nobody really knows." arXiv preprint arXiv:2212.07712, 2022.

[3] R. K. Yin, Case study research and applications. Sage, 2018.

Files (322.6 MB)

  • documents-metadata.csv (44.3 kB), md5:0b31ec611659e72a9263ae9713a5fb1b
  • documents-texts.zip (322.6 MB), md5:c6bad6c36bc78ba08b575b7d3eea0ed0
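The MD5 digests listed for the files can be used to confirm that a download completed without corruption. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and the local file path is whatever location the file was saved to.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a local file against a digest written as 'md5:<hex>' or '<hex>'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

The digest string can be pasted exactly as shown on this page, including the `md5:` prefix. MD5 is adequate here for detecting accidental corruption, not for protecting against deliberate tampering.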

Additional details

Related works

Is described by
Journal article: 10.1109/MSEC.2022.3222115 (DOI)
Preprint: 10.48550/arXiv.2212.07712 (DOI)

Funding

European Commission: CyberSec4Europe – Cyber Security Network of Competence Centres for Europe (grant 830929)
