LLM-Assisted Variable Annotation using the I-ADOPT Framework
Description
Within our NFDI4Earth-Pilot, we are jointly developing an LLM-assisted variable annotation service that uses recent advances in Large Language Models (LLMs) to automate the semantic decomposition of variable descriptions. Our approach employs the community-driven I-ADOPT framework to break natural-language variable definitions down into their essential atomic elements, ensuring naming consistency and interoperability across domains. The system incorporates retrieval-augmented generation (RAG) to access relevant literature and controlled vocabularies, enabling more precise annotations and reducing manual effort for data producers. By aligning AI-driven methods with established semantic standards, our work addresses several focus areas in Earth System Sciences (including Foundation Models & LLMs, metadata management, and data workflows) while also supporting the broader objectives of NFDI and higher-level initiatives such as the EOSC. The approach thereby enhances reproducibility, interoperability, and cross-disciplinary collaboration in day-to-day research data stewardship.
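To illustrate what such a decomposition could look like, the following is a minimal sketch and not the service's actual implementation. The role names (Property, ObjectOfInterest, Matrix, ContextObject, Constraint) follow the I-ADOPT ontology; the class names, the `link_terms` helper, the toy vocabulary, and the `ex:` identifiers are hypothetical. Simple fuzzy string matching stands in for the retrieval step, and the LLM call that would produce the decomposition is left out.

```python
from dataclasses import dataclass, field
from difflib import get_close_matches
from typing import Optional


@dataclass
class Constraint:
    """A restriction on one entity of the variable (e.g. 'dissolved')."""
    label: str
    constrains: str  # label of the entity the constraint applies to


@dataclass
class IAdoptVariable:
    """Hypothetical container for one decomposed variable description."""
    label: str
    property: str
    object_of_interest: str
    matrix: Optional[str] = None
    context_objects: list[str] = field(default_factory=list)
    constraints: list[Constraint] = field(default_factory=list)

    def components(self) -> dict[str, str]:
        """Return the filled atomic components, keyed by I-ADOPT role."""
        parts = {
            "Property": self.property,
            "ObjectOfInterest": self.object_of_interest,
        }
        if self.matrix:
            parts["Matrix"] = self.matrix
        for i, ctx in enumerate(self.context_objects, start=1):
            parts[f"ContextObject[{i}]"] = ctx
        return parts


# Assumption: a toy controlled vocabulary mapping labels to placeholder IDs.
TOY_VOCABULARY = {
    "concentration": "ex:concentration",
    "nitrate": "ex:nitrate",
    "river water": "ex:river_water",
}


def link_terms(variable: IAdoptVariable, vocabulary: dict[str, str]) -> dict[str, Optional[str]]:
    """Map each component to its closest vocabulary entry, or None if nothing matches."""
    linked = {}
    for role, term in variable.components().items():
        match = get_close_matches(term.lower(), list(vocabulary), n=1, cutoff=0.8)
        linked[role] = vocabulary[match[0]] if match else None
    return linked


# In a real pipeline this decomposition would come from the LLM; here it is hand-written.
example = IAdoptVariable(
    label="Concentration of dissolved nitrate in river water",
    property="Concentration",
    object_of_interest="Nitrate",
    matrix="River water",
    constraints=[Constraint(label="dissolved", constrains="Nitrate")],
)
linked = link_terms(example, TOY_VOCABULARY)
```

Components that return `None` have no close vocabulary match; in this sketch they would be the cases left for a data producer to review manually.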
Files
| Name | Size |
|---|---|
| DSgG2025_IADOPT_LLM_Service.pdf (md5:874cdb8152fbe55464f598073ba4abd1) | 2.3 MB |
Additional details
Software
- Programming language: Python