Published May 10, 2018 | Version v1
Conference paper Open

Automating Document Discovery in the Systematic Review Process: How to Use Chaff to Extract Wheat

  • 1. LIMSI, CNRS
  • 2. AMC, University of Amsterdam

Description

Systematic reviews, in fields such as empirical medicine, address research questions by comprehensively examining the entire published literature. Conventionally, manual literature surveys decide inclusion in two steps: first based on titles and abstracts, then on full text. Yet current methods to automate the process make no distinction between gold data from these two stages. In this work we compare the impact that different schemes for choosing positive and negative examples from the two screening stages have on the training of automated systems. We train a ranker using logistic regression and evaluate it on a new gold standard dataset for clinical NLP, and on an existing gold standard dataset for drug class efficacy. The classification and ranking achieve average AUCs of 0.803 and 0.768 when relying on gold standard decisions based on titles and abstracts, and AUCs of 0.625 and 0.839 when relying on gold standard decisions based on full text. Our results suggest that it makes little difference which screening stage the gold standard decisions are drawn from, and that the decisions need not be based on the full text. The results further suggest that common off-the-shelf algorithms can reduce the amount of work required to retrieve relevant literature.
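The workflow the abstract describes — training a logistic-regression ranker on screening decisions and scoring candidate articles by relevance — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy "title + abstract" texts and inclusion labels below are invented for demonstration, and the feature representation (TF-IDF) is an assumption, since the abstract does not specify one.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Toy screening decisions (1 = include, 0 = exclude); illustrative only.
train_texts = [
    "randomized controlled trial of drug efficacy in patients",
    "clinical NLP methods for extracting concepts from patient notes",
    "review of agricultural irrigation techniques",
    "survey of wheat harvesting machinery",
]
train_labels = [1, 1, 0, 0]

# Unscreened candidates to be ranked for manual review.
test_texts = [
    "trial assessing drug efficacy outcomes in a patient cohort",
    "machinery maintenance schedules for wheat farms",
]
test_labels = [1, 0]  # held-out gold decisions, used only for evaluation

# Represent each title + abstract as a TF-IDF vector (assumed featurization).
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)

# Off-the-shelf logistic regression, as in the paper's general approach.
clf = LogisticRegression()
clf.fit(X_train, train_labels)

# Rank candidates by predicted inclusion probability, most relevant first,
# and evaluate the ranking with AUC against the held-out decisions.
scores = clf.predict_proba(X_test)[:, 1]
ranking = sorted(zip(test_texts, scores), key=lambda pair: -pair[1])
auc = roc_auc_score(test_labels, scores)
```

In practice the training labels would come from either the title/abstract screening stage or the full-text stage — the choice whose impact the paper measures — and the ranked list would determine the order in which reviewers read candidate articles.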

Files

777.pdf (206.0 kB)
md5:4f4fcc8c90a96c61860c867080a55e30

Additional details

Funding

MIROR – Methods in Research on Research (grant 676207), European Commission