Conference paper Open Access

Data Extraction and Synthesis in Systematic Reviews of Diagnostic Test Accuracy: A Corpus for Automating and Evaluating the Process

Norman, Christopher; Leeflang, Mariska; Névéol, Aurélie

Background: Systematic reviews are critical for obtaining accurate estimates of diagnostic test accuracy (DTA), yet they require extracting information buried in free-text articles, which is often laborious. Objective: We create a dataset describing the data extraction and synthesis processes in 63 DTA systematic reviews, and demonstrate its utility by using it to replicate the data synthesis in the original reviews. Method: We construct our dataset using a custom automated extraction pipeline complemented with manual extraction, verification, and post-editing. We evaluate the dataset through manual assessment by two annotators and by comparison against data extracted from source files. Results: The constructed dataset contains 5,848 test results for 1,354 diagnostic tests from 1,738 diagnostic studies. We observe an extraction error rate of 0.06–0.3%. Conclusions: This constitutes the first dataset describing the later stages of the DTA systematic review process, and is intended to be useful for automating or evaluating the process.

Files (101.9 kB)
Name: normanAMIA2018_revised.pdf
Size: 101.9 kB
md5:fb1402dac89ed77e4b6cef3048d1928e