Published July 23, 2023 | Version v1
Presentation Open

Data Lakehouse to support the development of AI models for predicting patient response to anti-tumor therapies

  • 1. CEA
  • 2. Universidade de Lisboa
  • 3. Fundacio Eurecat
  • 4. University of Gdansk
  • 5. University of Rome Tor Vergata

Description

In the context of the European precision medicine project KATY, we prototyped a Data Lakehouse that integrates research studies which generated molecular profiling data from cohorts of kidney tumor tissues taken from patients enrolled in clinical drug trials. There is currently no database dedicated to supporting the development of AI models that help doctors choose the best drug for each patient.
The Data Lakehouse architecture, which we implemented with the open-source Delta Lake technology, combines the best features of a Data Lake and a Data Warehouse.
The Data Lakehouse will give KATY consortium members three ways to use the data:
- applying data analytics approaches to query and visualize molecular and clinical data
- extracting targeted subsets of data for the training and testing of AI models
- feeding a Knowledge Graph that supports the explainability of the predictive models using a priori biological and clinical knowledge.
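
As an illustration of the second use, targeted extraction for training and testing, the following is a minimal sketch in plain Python. It is not the KATY implementation and does not use Delta Lake: the two in-memory lists stand in for lakehouse tables, and every record, field name, drug name, gene name, and function (`extract_features`, `assign_split`) is a hypothetical placeholder. It joins clinical and molecular rows on a patient identifier, keeps patients treated with one drug who have a complete gene panel, and assigns each patient to a train or test set by hashing the identifier so that the split is reproducible.

```python
import hashlib

# Hypothetical stand-ins for two lakehouse tables; all values are made up.
clinical = [
    {"patient_id": "P001", "drug": "drug_A", "response": "responder"},
    {"patient_id": "P002", "drug": "drug_A", "response": "non_responder"},
    {"patient_id": "P003", "drug": "drug_B", "response": "responder"},
    {"patient_id": "P004", "drug": "drug_A", "response": "non_responder"},
]
expression = [
    {"patient_id": "P001", "gene": "GENE_1", "value": 2.5},
    {"patient_id": "P001", "gene": "GENE_2", "value": 0.8},
    {"patient_id": "P002", "gene": "GENE_1", "value": 1.1},
    {"patient_id": "P003", "gene": "GENE_1", "value": 3.0},
    {"patient_id": "P003", "gene": "GENE_2", "value": 1.9},
    {"patient_id": "P004", "gene": "GENE_1", "value": 0.4},
    {"patient_id": "P004", "gene": "GENE_2", "value": 2.2},
]


def extract_features(clinical_rows, expression_rows, drug, genes):
    """Join clinical and molecular rows on patient_id for one drug and gene panel."""
    # Collect the requested gene values per patient.
    by_patient = {}
    for row in expression_rows:
        if row["gene"] in genes:
            by_patient.setdefault(row["patient_id"], {})[row["gene"]] = row["value"]

    samples = []
    for row in clinical_rows:
        profile = by_patient.get(row["patient_id"], {})
        # Keep only patients on the requested drug with every gene measured.
        if row["drug"] == drug and all(g in profile for g in genes):
            samples.append({
                "patient_id": row["patient_id"],
                "features": [profile[g] for g in genes],
                "label": row["response"],
            })
    return samples


def assign_split(patient_id, test_fraction=0.25):
    """Assign a patient to 'train' or 'test' deterministically from a hash of the id."""
    digest = hashlib.sha256(patient_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # value in [0, 1]
    return "test" if bucket < test_fraction else "train"


samples = extract_features(clinical, expression, drug="drug_A", genes=["GENE_1", "GENE_2"])
splits = {"train": [], "test": []}
for sample in samples:
    splits[assign_split(sample["patient_id"])].append(sample)
```

Splitting by a hash of the patient identifier, rather than by random sampling of rows, keeps all data from one patient on the same side of the split and gives the same assignment on every extraction.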

Files

Coquelet_ECCB-ISMB2023.pdf (1.9 MB)
md5:3c0c1b6637c98a16406653e7a573850a

Additional details

Funding

CANVAS – Enhancing Cancer Vaccine Science for New Therapy Pathways 101079510
European Commission
KATY – Knowledge At the Tip of Your fingers: Clinical Knowledge for Humanity 101017453
European Commission