Published August 13, 2025 | Version v1
Project deliverable | Open

Deliverable 5.5: Querying components

Description

HetERogeneous sEmantic Data integratIon for the guT-bRain interplaY (HEREDITARY) develops novel approaches for the analysis and exploration of multi-modal biomedical data. Multi-modality poses a challenge to data analysis because the data differ in type, scale, and level of detail. Current approaches can represent multi-modal data in unified data structures, such as Knowledge Graphs (KGs) and tabular data. Because the data are complex and are typically accessed through expert-level interfaces such as SQL and SPARQL, users often struggle to search such data sets effectively and to discover insights in them.

In this HEREDITARY deliverable, we present novel approaches for effective user access to complex data, including retrieval, explanation, and comparison of data. Specifically, we leverage the potential of state-of-the-art Large Language Models. We show how users, from novice to expert level, can express their information need as a natural language statement. The system uses these statements to create answers for the user and can involve users in an information-seeking dialogue. Our approaches operate on KG data (e.g., as obtained by information extraction algorithms from large publication data) and on tabular data (e.g., structured, complex patient information from clinical research). Preliminary evaluation shows the large potential of these approaches for retrieval and exploration in complex, multi-modal data. Building on this, in follow-up work we will refine, integrate, and evaluate these querying components as part of requirements engineering and scientific dissemination.

Files

HEREDITARY_D5_5.pdf

Files (3.6 MB)

md5:437adf23017e2ff76f7794aec2327df3

Additional details

Funding

European Commission
HEREDITARY - HetERogeneous sEmantic Data integratIon for the guT-bRain interplaY (grant 101137074)