Published February 12, 2022 | Version v1
Dataset Open

Data Set on Accuracy of Symptom Checker Apps in 2020

  • 1. Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin
  • 2. Institute of General Practice and Family Medicine, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin
  • 3. Division of Ergonomics, Department of Psychology and Ergonomics (IPA), Technische Universität Berlin

Description

These two data sets present the accuracy of the triage (disposition) and diagnostic advice of symptom checker apps sampled in 2020. The sample consists of 22 commonly used symptom checker apps, of which 14 also provide diagnostic advice. The apps were tested on 45 case vignettes, i.e. fictitious descriptions of patients. As not every app was able to appraise every vignette, our study yielded a total of 796 unique triage evaluations and 520 unique diagnostic evaluations. The data sets are a supplement to the paper "Triage Accuracy of Symptom Checker Apps: A Five-year Follow-up Evaluation" (doi: 10.2196/31810).
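To put the evaluation counts in context, the following minimal sketch compares them with the number of app-vignette pairs that would result if every app had appraised every vignette. It assumes that all 22 apps give triage advice (suggested by the word "also" above, but not stated explicitly); the function name `coverage` is hypothetical and not part of the data set.

```python
# Figures taken from the description above.
N_VIGNETTES = 45
N_APPS_TRIAGE = 22        # assumption: all 22 apps give triage advice
N_APPS_DIAGNOSIS = 14
TRIAGE_EVALUATIONS = 796
DIAGNOSTIC_EVALUATIONS = 520


def coverage(observed: int, n_apps: int, n_vignettes: int = N_VIGNETTES) -> float:
    """Share of all possible app-vignette pairs that were actually evaluated."""
    possible = n_apps * n_vignettes
    return observed / possible


if __name__ == "__main__":
    print(f"triage coverage:     {coverage(TRIAGE_EVALUATIONS, N_APPS_TRIAGE):.1%}")
    print(f"diagnostic coverage: {coverage(DIAGNOSTIC_EVALUATIONS, N_APPS_DIAGNOSIS):.1%}")
```

Under this assumption, the upper bounds are 990 triage and 630 diagnostic evaluations, so roughly four in five possible evaluations are present in each data set.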

The data were collected by Anna Dames as a partial requirement for her MSc degree in Human Factors in the Department of Psychology and Ergonomics (IPA) at Technische Universität Berlin.

The clinical vignettes were originally compiled and modified by Semigran et al. in 2015 (https://doi.org/10.1136/bmj.h3480), and were further adapted by Hill et al. (2020) (doi: 10.5694/mja2.50600) and in the study to which these data sets are a supplement (doi: 10.2196/31810).

Notes

Explanations of coding

1) Each case vignette was assigned a gold-standard triage level by an expert panel (see Semigran et al. (2015)). These levels are coded as follows: "Em" for "Emergency care required", "NE" for "Non-emergency care required", and "Sc" for "Self-care sufficient". For apps that only flag emergencies, "non-em" was coded for evaluations in which they did not consider the vignette to be an emergency.

2) Diagnostic accuracy is categorized into three levels: an app providing the correct diagnostic suggestion as its first suggestion ("1"), within the first ten suggestions ("10"), or not providing the correct diagnostic suggestion at all ("0").
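The coding scheme above can be turned into simple scoring helpers. The following is a minimal sketch, not the authors' analysis code: the function names are hypothetical, and two interpretations are assumptions rather than statements from the data set description. First, "non-em" is treated as correct whenever the gold standard is not an emergency. Second, "10" is taken to mean that the correct diagnosis was listed among the first ten suggestions but not first, so the top-10 rate counts both "1" and "10".

```python
from typing import Iterable, Tuple

# Gold-standard triage codes as described in the notes.
GOLD_LEVELS = {"Em", "NE", "Sc"}


def triage_correct(gold: str, advice: str) -> bool:
    """Return True if an app's triage advice matches the gold standard.

    Assumption: "non-em" (used for apps that only flag emergencies)
    counts as correct for both "NE" and "Sc" vignettes.
    """
    if gold not in GOLD_LEVELS:
        raise ValueError(f"unknown gold-standard code: {gold!r}")
    if advice == "non-em":
        return gold in {"NE", "Sc"}
    return advice == gold


def diagnostic_hit_rates(codes: Iterable[str]) -> Tuple[float, float]:
    """Return (top-1 rate, top-10 rate) for diagnostic codes "1", "10", "0".

    Assumption: "10" excludes first-place hits, so top-10 = "1" + "10".
    """
    codes = list(codes)
    if not codes:
        raise ValueError("no diagnostic evaluations given")
    top1 = sum(code == "1" for code in codes)
    top10 = top1 + sum(code == "10" for code in codes)
    return top1 / len(codes), top10 / len(codes)


if __name__ == "__main__":
    # Invented toy values for illustration only, not taken from the data sets.
    print(triage_correct("NE", "non-em"))
    print(diagnostic_hit_rates(["1", "10", "0", "0"]))
```

Keeping the "non-em" rule in a single function makes it easy to swap in a different interpretation, for example excluding emergency-only apps from the non-emergency accuracy figures.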

Files

Files (57.5 kB)

md5:628bcd3e02a8f04934295b4037b7cb8c (19.2 kB)
md5:6736e44bb7a72c6bbe8eb25b6b173dc4 (38.2 kB)

Additional details

Related works

Is supplement to
Journal article: 10.2196/31810 (DOI)

References

  • Hill MG, Sim M, Mills B. The quality of diagnosis and triage advice provided by free online symptom checkers and apps in Australia. Medical Journal of Australia [Internet] 2020 May 11 [cited 2020 May 14];mja2.50600. [doi: 10.5694/mja2.50600]
  • Semigran HL, Linder JA, Gidengil C, Mehrotra A. Evaluation of symptom checkers for self diagnosis and triage: audit study. BMJ [Internet] 2015 Jul 8 [cited 2018 Jun 3];h3480. [doi: 10.1136/bmj.h3480]
  • Schmieding ML, Kopka M, Schmidt K, Schulz-Niethammer S, Balzer F, Feufel MA. Triage Accuracy of Symptom Checker Apps: A Five-year Follow-up Evaluation (Preprint). Journal of Medical Internet Research [Internet] 2021 Jul 7 [cited 2022 Feb 6]; [doi: 10.2196/31810]