

Published July 23, 2022 | Version v1
Working paper | Open Access

Recommendations for disclosure control of trained Machine Learning (ML) models from Trusted Research Environments (TREs)

  • 1. Division of Population Health and Genomics, School of Medicine, University of Dundee & University of Glasgow
  • 2. Department of Mathematical Sciences, Durham University
  • 3. Dundee Law School, School of Humanities, Social Sciences and Law, University of Dundee
  • 4. Division of Population Health and Genomics, School of Medicine, University of Dundee, Dundee
  • 5. Division of Population Health and Genomics, School of Medicine, University of Dundee
  • 6. Department of Health and Social Sciences, University of the West of England, Bristol
  • 7. Department of Computer Science and Creative Technologies, University of the West of England, Bristol
  • 8. Department of Computer Engineering and Mathematics, Universitat Rovira i Virgili, Tarragona, Catalonia
  • 9. Public/patient advocate, University of Dundee, Dundee
  • 10. Bristol Business School, University of the West of England, Bristol
  • 11. Leverhulme Research Centre for Forensic Science, School of Science and Engineering, University of Dundee
  • 12. School of Computing Science, University of Glasgow, Glasgow

Description

Consultation:

This is a working version of the GRAIMATTER recommendations for disclosure control of machine learning models from trusted research environments. The document has already undergone a closed consultation period, and we are now seeking feedback during an open consultation period ending on the 15th of August 2022. Interested parties include those who run TREs, researchers, data governance teams, ethics and legal teams, AI/ML experts and data privacy experts.

If you would like to provide feedback, please download the attached feedback form and send it to Smarti Reel (sreel@dundee.ac.uk) and Emily Jefferson (erjefferson@dundee.ac.uk).

Within the document, we have added notes to the reader in italics. Although it is too soon to have experience with these recommendations in practice, we welcome feedback on the likely implementability of our recommendations, singly or in combination.

Background: 

TREs are widely and increasingly used to support statistical analysis of sensitive data across a range of sectors (e.g., health, police, tax and education), as they enable secure and transparent research whilst protecting data confidentiality.

There is an increasing desire from academia and industry to train AI models in TREs. The field of AI is developing quickly, with applications including spotting human errors, streamlining processes, automating tasks and supporting decisions. Complex AI models require more information to describe and reproduce than traditional statistical outputs, increasing the possibility that sensitive information about the secure data can be inferred from such descriptions. TREs do not yet have mature processes and controls to guard against these risks. This is a complex topic, and it is unreasonable to expect all TREs to be aware of all the risks, or to expect that TRE researchers have received AI-specific training to address them.
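One such risk, given here purely as an illustrative sketch and not taken from the recommendations themselves, is membership inference: a model that has effectively memorised its training records can reveal whether a given individual's record was in the sensitive training set. The toy model (a 1-nearest-neighbour "classifier"), the data and the confidence threshold below are all hypothetical.

```python
# Hypothetical sketch of a membership-inference risk on a memorising model.
# None of the names or values here come from the GRAIMATTER recommendations.

def train_1nn(records):
    """Return a predictor that memorises (feature, label) training records."""
    def predict_confidence(x):
        # Confidence is inversely related to distance to the nearest record,
        # so exact training records score a confidence of 1.0.
        nearest = min(abs(x - fx) for fx, _ in records)
        return 1.0 / (1.0 + nearest)
    return predict_confidence

# Sensitive training data (members) held inside the TRE.
members = [(0.10, 1), (0.35, 0), (0.70, 1)]
model = train_1nn(members)

def infer_membership(model, x, threshold=0.99):
    """Attacker guesses that x was a training record if confidence is near 1."""
    return model(x) >= threshold

print(infer_membership(model, 0.10))  # member: exact match, confidence 1.0
print(infer_membership(model, 0.52))  # non-member: lower confidence
```

Because the released model's behaviour alone distinguishes members from non-members, simply exporting the trained model from the TRE can disclose who was in the data, even though no raw records leave the environment.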

GRAIMATTER has developed a draft set of usable recommendations for TREs to guard against the additional risks when disclosing trained AI models from TREs.

This work was funded by UK Research and Innovation Grant Number MC_PC_21033 as part of Phase 1 of the DARE UK (Data and Analytics Research Environments UK) programme (https://dareuk.org.uk/), delivered in partnership with HDR UK and ADR UK. The specific project was Guidelines and Resources for AI Model Access from TrusTEd Research environments (GRAIMATTER).

Notes

This project has also been supported by MRC and EPSRC [grant number MR/S010351/1]: PICTURES.

Files

Finding private information from publicly available data.pdf