Use of NHS Digital datasets as trial data in the UK: a position paper
Creators
- 1. MRC Clinical Trials Unit at University College London; Health Data Research UK; NHS DigiTrials Programme, NHS Digital
- 2. NHS DigiTrials Programme, NHS Digital
- 3. Nuffield Department of Population Health, University of Oxford; Health Data Research UK
- 4. University of Leeds; Data Services Directorate, NHS Digital
- 5. MRC Clinical Trials Unit at University College London; London School of Hygiene and Tropical Medicine, University of London; Health Data Research UK
- 6. Nuffield Department of Population Health, University of Oxford; Health Data Research UK; NHS DigiTrials Programme, NHS Digital
- 7. MRC Clinical Trials Unit at University College London; Health Data Research UK
- 8. MRC Clinical Trials Unit at University College London; Health Data Research UK; BHF Data Science Centre
Description
Background: Clinical trial teams increasingly want to use data from healthcare systems (“healthcare data”), particularly to enhance recruitment and follow-up of participants, to reduce time and cost, and to avoid duplication of effort. However, there is continued uncertainty about how regulators regard healthcare data used for trial purposes, in terms of provenance, quality and reliability.
Objectives: There were two key objectives: first, to demonstrate the data integrity of the two datasets held by NHS Digital (NHSD) that are most frequently requested by trial teams; and second, to set out an approach by which other healthcare systems datasets can be similarly evaluated.
Method: The data lifecycles of the datasets were documented, mapping the flow of data from the originating healthcare providers’ databases to NHSD warehouses and onwards to clinical trial teams. These lifecycles were assessed for evidence of whether the datasets are accurate, reliable, complete, contemporaneous, and well-governed.
Results: The assessment method was applied to (a) the Hospital Episode Statistics Admitted Patient Care (HES APC) dataset and (b) the Civil Registration of Deaths (CRD) dataset. This paper demonstrates that their collection and management through NHSD systems ensure their integrity and reliability. The datasets are accurate representations of the data held by the originating providers (acute NHS trusts and local registrars).
Conclusion: Based on these findings, the HES APC and CRD datasets satisfy the assessment criteria that demonstrate they are reliable transcribed copies of the original source data.
Implications: First, these datasets can be used directly as clinical trial data, with trial teams focusing on the accuracy of the algorithms and processes used to identify particular outcomes rather than on the integrity of the data flow. Second, this assessment approach should be used to assess whether other healthcare systems datasets are ready to be used as transcribed copies of source data, and data providers should take appropriate steps to remedy any shortcomings where they are not.
Notes
Files
Name | Size |
---|---|
Position paper on NHSD data v2.0 20220211 final.pdf (md5:ccef740675348a07d1426514482220e8) | 1.2 MB |
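Readers who download the PDF may want to confirm that their copy matches the MD5 checksum listed for this record. The following is a minimal sketch in Python, assuming the file has been saved under its listed name in the current directory. The function names `md5_of_file` and `matches_listed_checksum` are illustrative and are not part of any repository tooling.

```python
import hashlib
from pathlib import Path

# Checksum as listed in the Files section of this record.
LISTED_MD5 = "ccef740675348a07d1426514482220e8"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, expected=LISTED_MD5):
    """Return True if the file's MD5 equals the expected value (case-insensitive)."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local file name; adjust to wherever the download was saved.
    pdf = Path("Position paper on NHSD data v2.0 20220211 final.pdf")
    if pdf.exists():
        print("checksum OK" if matches_listed_checksum(pdf) else "checksum MISMATCH")
    else:
        print(f"file not found: {pdf}")
```

A mismatch usually points to an incomplete or altered download. An MD5 comparison of this kind detects accidental corruption only and should not be treated as a security guarantee.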
Additional details
Related works
- Has metadata
- Diagram: https://dedicate.healthandcaremetadata.uk/ (URL)
- Is continued by
- Journal article: 10.1177/14604582241276969 (DOI)
- Is described by
- Journal article: 10.1016/S2589-7500(22)00122-4 (DOI)
- Is supplemented by
- 10.5281/zenodo.6047938 (DOI)