Published April 1, 2022 | Version v1
Book Open

Translational Data Science in Population Health

  • 1. Leiden University


This work reaffirms translational data science as an independent discipline and explains its why, how and what. The overarching storyline runs from conceptual science policy to unruly implementation in daily practice. The Cross-Industry Standard Process for Data Mining (CRISP-DM) offers a multi-layered, cyclic, step-by-step guideline for carrying out each data analysis task within each phase of the translational data science process in a standardised and proven manner. Each phase introduces several relevant research areas to illustrate the rich academic opportunities within translational data science.

First, we need to understand the application domain under investigation better and more explicitly, because we want to proceed on the basis of practical use considerations. Second, we collect, describe and interactively explore the necessary data and assess its quality, to observe first-hand the variety of values within the data. Third, we make all necessary data preparations to be able to carry out the subsequent data analysis. Fourth, we select and run the most suitable algorithms for each analysis objective and technically evaluate their performance. Fifth, we assess the extent to which the data model can be meaningfully interpreted to address the practical use considerations. Sixth and finally, we implement the prediction model for use in daily practice, and gain a better fundamental understanding of the exact behaviour of the data analysis technique in a real-world scenario.
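
The six steps above correspond to the six standard CRISP-DM phases, and their cyclic ordering can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the lecture: the names `Phase`, `next_phase` and `one_cycle` are hypothetical, and the comments paraphrase the steps listed above.

```python
from enum import Enum
from typing import List


class Phase(Enum):
    """The six CRISP-DM phases, in the order the text walks through them."""
    BUSINESS_UNDERSTANDING = 1  # understand the application domain
    DATA_UNDERSTANDING = 2      # collect, describe, explore, assess data quality
    DATA_PREPARATION = 3        # prepare the data for analysis
    MODELING = 4                # select and run algorithms, evaluate technically
    EVALUATION = 5              # interpret the model against practical use
    DEPLOYMENT = 6              # implement the model in daily practice


def next_phase(current: Phase) -> Phase:
    """Return the phase after `current`, wrapping around after deployment."""
    members = list(Phase)
    return members[(members.index(current) + 1) % len(members)]


def one_cycle(start: Phase = Phase.BUSINESS_UNDERSTANDING) -> List[Phase]:
    """Return all six phases in order, beginning at `start`."""
    members = list(Phase)
    i = members.index(start)
    return members[i:] + members[:i]


if __name__ == "__main__":
    for phase in one_cycle():
        print(phase.value, phase.name)
```

The wrap-around in `next_phase` reflects the cyclic character of the process: what is learned during deployment feeds a renewed understanding of the application domain.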


Extended transcript of the inaugural lecture of Marco Spruit on the acceptance of the position of professor of Advanced Data Science in Population Health on 1 April 2022 at Leiden University's medical and science faculties.


Marco Spruit (2022) Inaugural lecture booklet - Translational Data Science in Population Health.pdf