Published June 15, 2021 | Version v1
Journal article Open

The Importance of Being External. Methodological insights for the external validation of machine learning models in medicine

  • 1. University of Milano-Bicocca, Viale Sarca 336, 20126, Milano, Italy
  • 2. Computer Science Department, University of Sheffield, Sheffield, UK
  • 3. Laboratory Medicine Department, Hospital Universitario Santa Lucía, Cartagena, Spain
  • 4. National Reference Laboratory for Clinical Chemistry, Ethiopian Public Health Institute, Addis Ababa, Ethiopia
  • 5. Laboratorio di chimica clinica, Ospedale di Desio e Monza, ASST-Monza, Dipartimento di medicina e chirurgia, Università di Milano-Bicocca, Monza, Italy
  • 6. Laboratorio di chimica clinica, Ospedale Papa Giovanni XXIII, Bergamo, Italy
  • 7. IRCCS Ospedale San Raffaele, Via Olgettina, 60, 20132, Milano, Italy

Description

This repository contains the pre-print and data associated with the publication titled "The Importance of Being External. Methodological insights for the external validation of machine learning models in medicine".

The data is collected in the file "Datasets.zip", which contains the following files:

  • "all-data-processed-v3.xlsx" contains the Brazil-1, Brazil-2 and Brazil-3 datasets
  • "desio_cbc.xls", "bergamo_cbc.xls" and "HSR_novembre.xlsx" contain the Italy-1, Italy-2 and Italy-3 datasets
  • "CBC Italy Octubre (sent 24-12-2020).xlsx" contains the Spain dataset
  • "Etiopia 200 COVID +.xlsx" and "cbc data_anna copy.xlsx" contain the Ethiopia dataset (positive and negative cases, respectively)
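
As a quick sanity check after downloading, the file list above can be compared against the contents of the archive. The following is a minimal sketch, not part of the authors' code: the names `ENTRIES` and `missing_files` are hypothetical, and it assumes the files may sit at any folder depth inside "Datasets.zip", so it matches on base names only.

```python
import zipfile
from pathlib import PurePosixPath

# (file names, dataset labels) pairs transcribed from the list above.
# Files are kept grouped exactly as the description groups them.
ENTRIES = [
    (("all-data-processed-v3.xlsx",), ("Brazil-1", "Brazil-2", "Brazil-3")),
    (("desio_cbc.xls", "bergamo_cbc.xls", "HSR_novembre.xlsx"),
     ("Italy-1", "Italy-2", "Italy-3")),
    (("CBC Italy Octubre (sent 24-12-2020).xlsx",), ("Spain",)),
    (("Etiopia 200 COVID +.xlsx", "cbc data_anna copy.xlsx"), ("Ethiopia",)),
]


def missing_files(archive_path):
    """Return the expected file names that are not present in the archive.

    Assumption: the internal folder layout is unknown, so only the base
    name of each archive member is compared.
    """
    with zipfile.ZipFile(archive_path) as zf:
        present = {PurePosixPath(name).name for name in zf.namelist()}
    expected = [name for files, _ in ENTRIES for name in files]
    return [name for name in expected if name not in present]
```

Calling `missing_files("Datasets.zip")` on a complete download should return an empty list; any names it returns point to files that were not found under that exact spelling.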

Code showing how to process the data, as well as all code for producing the visualizations shown in the paper, is available at https://github.com/AndreaCampagner/qualiMLpy

For any further information, you can contact me at a.campagner@campus.unimib.it

Files

Datasets.zip

Files (3.1 MB)

Checksum Size
md5:760612b8ece931c1f8e0dbe9106ae9f4 916.2 kB
md5:79a0ff4a2376800e12a7f41c08adf3e1 2.2 MB
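
The MD5 checksums listed above can be used to confirm that a download is intact. The sketch below is an illustration only: the helper names `md5_of` and `matches_listed_checksum` are hypothetical, and because the listing does not state which checksum belongs to which file, the check simply tests membership in the set of listed values.

```python
import hashlib

# Checksums copied from the file listing above (without the "md5:" prefix).
LISTED_MD5 = {
    "760612b8ece931c1f8e0dbe9106ae9f4",
    "79a0ff4a2376800e12a7f41c08adf3e1",
}


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path):
    """Return True if the file's MD5 equals one of the listed checksums."""
    return md5_of(path) in LISTED_MD5
```

For example, `matches_listed_checksum("Datasets.zip")` should return `True` for an uncorrupted copy of a file from this record. MD5 is suitable here for detecting transfer errors, not for security purposes.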