Working paper Open Access
Jefferson, Emily; Liley, James; Malone, Maeve; Reel, Smarti; Crespi-Boixader, Alba; Kerasidou, Xaroula; Tava, Francesco; McCarthy, Andrew; Preen, Richard; Blanco-Justicia, Alberto; Mansouri-Benssassi, Esma; Domingo-Ferrer, Josep; Beggs, Jillian; Chuter, Antony; Cole, Christian; Ritchie, Felix; Daly, Angela; Rogers, Simon; Smith, Jim
This is a working version of the GRAIMATTER recommendations for disclosure control of machine learning models from trusted research environments (TREs). The document has already undergone a closed consultation period, and we are now seeking feedback during an open consultation period ending on 15 August 2022. Interested parties include those who run TREs, researchers, data governance teams, ethics and legal teams, AI/ML experts and data privacy experts.
If you would like to provide feedback, please download the attached feedback form and send it to Smarti Reel (firstname.lastname@example.org) and Emily Jefferson (email@example.com).
Within the document, we have added notes to the reader in italics. Although it is too soon to have experience with these recommendations in practice, we welcome feedback on the likely implementability of our recommendations, singly or in combination.
Please note: The revised/updated codes for Appendix H: Case Studies can be found here: Examples Appendix H-Case Studies.pdf
TREs are widely and increasingly used to support statistical analysis of sensitive data across a range of sectors (e.g., health, policing, tax and education), as they enable secure and transparent research whilst protecting data confidentiality.
There is an increasing desire from academia and industry to train AI models in TREs. The field of AI is developing quickly, with applications including spotting human errors, streamlining processes, task automation and decision support. Compared with traditional statistical outputs, these complex AI models require more information to describe and reproduce, increasing the possibility that sensitive personal data can be inferred from such descriptions. TREs do not yet have mature processes and controls to guard against these risks. This is a complex topic, and it is unreasonable to expect all TREs to be aware of all the risks, or to expect TRE researchers to have received AI-specific training that addresses them.
GRAIMATTER has developed a draft set of usable recommendations for TREs to guard against the additional risks that arise when disclosing trained AI models from TREs.
This work was funded by UK Research and Innovation Grant Number MC_PC_21033 as part of Phase 1 of the DARE UK (Data and Analytics Research Environments UK) programme (https://dareuk.org.uk/), delivered in partnership with HDR UK and ADR UK. The specific project was Guidelines and Resources for AI Model Access from TrusTEd Research environments (GRAIMATTER).
Recommendations for disclosure control of trained Machine Learning (ML) models from Trusted Research Environments (TREs).pdf