Published February 25, 2025 | Version v1
Presentation Open

AI4DiTraRe: Towards LLM-Based Information Extraction for Standardising Climate Research Repositories

Description

In the petabyte era of climate research, harmonising diverse environmental and geoscientific datasets is critical for improving data interoperability and supporting effective interdisciplinary studies. This paper proposes an LLM-based tool for extracting and standardising metadata from climate research repositories. The approach leverages the adaptability of LLMs and their ability to interpret contextual nuances. By addressing common inconsistencies, such as varying parameters (observation types), units, and definitions, the proposed tool is intended to make data integration substantially more effective. It is a first step towards a unified metadata schema that adheres to the FAIR principles.

Series information

This is a presentation of a position paper accepted for publication at the First AAAI Bridge on Artificial Intelligence for Scholarly Communication (AI4SC), 25-26 February 2025, Philadelphia, Pennsylvania, USA, co-located with the 39th AAAI Conference on Artificial Intelligence (AAAI-25).

Files

2025_02_25_AI4SC_AAAI_DiTraRe_FIZ_IMK.pdf

Files (4.0 MB)

Name: 2025_02_25_AI4SC_AAAI_DiTraRe_FIZ_IMK.pdf
Size: 4.0 MB
md5:7782ab3f373b6a611835c1394b3f590f

Additional details

Related works

Continues
Proposal: 10.5281/zenodo.11109405 (DOI)
Describes
Conference paper: 10.5281/zenodo.14872359 (DOI)

Funding

Leibniz Association
Leibniz Science Campus "Digital Transformation of Research" W74/2022

Dates

Available
2025-02-25
Date of the presentation