Published May 4, 2021 | Version v1
Dataset Open

GES classification dataset

  • 1. Center for Mathematical Modeling, University of Chile

Description

Background

In Chile, a patient who needs a specialty consultation or surgery must first be referred by a general practitioner and is then placed on a waiting list. The Explicit Health Guarantees (GES, from the Spanish name) set, by law, a maximum waiting time for resolving a set of important health problems. Usually, a health professional manually checks whether each referral, written in natural language, corresponds to a GES-covered disease. An error in this classification is catastrophic for patients: a GES-covered referral wrongly marked as not covered places the patient on a non-prioritized waiting list, characterized by prolonged waiting times.

Methods

To support the manual process, we developed and deployed a system that automatically classifies referrals as GES-covered or not. The system is based on word embeddings trained specifically on clinical text produced in Chile. As features, we used a vector representation of the reason for referral together with the patient's age, and we trained machine learning models on human-labeled historical data. To validate the results, we constructed a ground truth dataset that combines classifications made by three healthcare experts.
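The feature construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the deployed system: the record does not say how the referral text is turned into a vector, so the sketch averages word vectors, which is a common baseline. The `toy_embeddings` table, its 3-dimensional vectors, the example referral text, and the function name `referral_features` are all hypothetical.

```python
from typing import Dict, List

# Hypothetical 3-dimensional embeddings, for illustration only. Real clinical
# embeddings would be much larger and trained on Chilean clinical text.
toy_embeddings: Dict[str, List[float]] = {
    "catarata": [0.9, 0.1, 0.0],
    "ojo": [0.7, 0.3, 0.0],
    "derecho": [0.1, 0.1, 0.8],
}


def referral_features(text: str, age: float,
                      embeddings: Dict[str, List[float]],
                      dim: int) -> List[float]:
    """Average the vectors of known words, then append the patient's age.

    Averaging is an assumption of this sketch, not the documented method.
    """
    vectors = [embeddings[w] for w in text.lower().split() if w in embeddings]
    if vectors:
        mean = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    else:
        mean = [0.0] * dim  # no known word: fall back to a zero vector
    return mean + [float(age)]


# Made-up referral text and age.
features = referral_features("Catarata ojo derecho", 67, toy_embeddings, dim=3)
print(features)
```

The resulting vector has one entry per embedding dimension plus one for age, and could be passed to any standard classifier.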

Results

The best-performing model reached an AUC of 0.94 on the ground truth dataset. Over seven months of continuous, voluntary use, the system amended 87 patient misclassifications.
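For readers unfamiliar with the metric, the AUC can be read as the probability that a randomly chosen GES-covered referral receives a higher score than a randomly chosen non-covered one. The sketch below computes it by direct pairwise comparison. The function name `auc_score` and the toy labels and scores are made up for illustration; they are not taken from the dataset, and this is not necessarily how the reported figure was computed.

```python
from typing import Sequence


def auc_score(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = GES-covered) and model scores, for illustration only.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auc_score(toy_labels, toy_scores))  # 0.75: three of four pairs ranked correctly
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a small validation set; a rank-based implementation would be preferable for large ones.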

Conclusion

This system is the result of a collaboration between technical and clinical experts. The classifier was custom-tailored to a hospital's clinical workflow, which encouraged voluntary use of the platform. Because the registry is uniform across Chile, our solution can be easily extended to other hospitals.

Files

ground_truth.csv (32.3 kB)
md5:27834af7ababf26eef1a0ee8a89ed6da
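
After downloading, the file can be checked against the MD5 checksum listed above. The following is a minimal sketch using only the Python standard library; the local path `ground_truth.csv` and the helper names `md5_of_file` and `matches_checksum` are assumptions made for illustration.

```python
import hashlib
from pathlib import Path

# Checksum as listed in this record's file listing.
EXPECTED_MD5 = "27834af7ababf26eef1a0ee8a89ed6da"


def md5_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: Path, expected: str = EXPECTED_MD5) -> bool:
    """Compare a file's MD5 digest with the expected hex string."""
    return md5_of_file(path) == expected.lower()


# Assumed location of the downloaded file; adjust as needed.
local_copy = Path("ground_truth.csv")
if local_copy.exists():
    print("checksum ok" if matches_checksum(local_copy) else "checksum mismatch")
```

A mismatch usually indicates an incomplete or altered download, in which case the file should be fetched again.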