Published August 4, 2025 | Version v1
Conference paper
Open
DataPLAN's Model Context Protocol (MCP) Server: Enhance AI Support for Data Management Plans
Creators
- 1. IBG-4: Bioinformatics, CEPLAS, BIOSC, Forschungszentrum Jülich, Wilhelm Johnen Straße, Jülich, Germany
- 2. HHU Düsseldorf, Faculty of Mathematics and Natural Sciences, Institute for Biological Data Science, CEPLAS, Germany
- 3. Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstraße 3, Gatersleben, Germany
- 4. Computer Center, University of Freiburg, Freiburg im Breisgau, Germany
- 5. Computational Systems Biology, RPTU University of Kaiserslautern, Kaiserslautern, Germany
Contributors
Editors:
- 1. Nationale Forschungsdateninfrastruktur (NFDI) e.V.
- 2. University of Amsterdam
Description
Generating and implementing Data Management Plans (DMPs) effectively can be challenging, as it involves coordinating proposals, software, repositories, and best practices. Platforms like BioChatter [1] use Large Language Models (LLMs) to help close data knowledge gaps, and DataPLAN [2] supports automated DMP generation. Still, gathering information for a DMP and implementing it require manual effort. The DataPLAN server for the Model Context Protocol (MCP) [3], a secure and standardized communication framework, allows LLMs to connect with external resources, making DMP generation and implementation more efficient.
The DataPLAN MCP server enhances DMP generation by standardizing context exchange and enriching user inputs (e.g., project aims, data types) with external references such as ontologies (e.g., EDAM Ontology [4], DPBO Ontology [5]), repositories (e.g., PLANTdataHUB [6], e!DAL-PGP [7]), tools (e.g., Swate, ARCtrl), and unstructured text. For example, a research proposal in free-text format can be analyzed to extract key metadata (project names, research topics, and objectives), helping to ensure comprehensive and standards-aligned DMPs.
For DMP implementation, LLM agents can use the DataPLAN MCP server to execute tasks such as data transformation, knowledge management, and compliance monitoring, reducing manual effort. For example, if a DMP specifies the daily conversion of data from eLabFTW [8] to PLANTdataHUB using the elab2arc tool, the MCP server can direct an LLM agent to schedule and execute the conversion and to document its completion. This automation mitigates errors and enhances adherence to evolving data management guidelines. Our MCP server provides a flexible foundation for cross-consortium DMP generation and implementation, with integration facilitated through DMP4NFDI (a Base4NFDI service).
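To make the conversion example more concrete, the following minimal sketch shows what a tool invocation sent to an MCP server could look like. MCP messages are JSON-RPC 2.0, and tools are invoked with the `tools/call` method. The tool name `convert_elab_to_arc` and its arguments are hypothetical and are not taken from the DataPLAN server.

```python
import json
from itertools import count

# Monotonically increasing JSON-RPC request ids.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and arguments, for illustration only;
# the actual tools exposed by the DataPLAN MCP server may differ.
request = build_tool_call(
    "convert_elab_to_arc",
    {"source": "eLabFTW", "target": "PLANTdataHUB", "schedule": "daily"},
)

print(json.dumps(request, indent=2))
```

In practice an MCP client library would build and send such messages on behalf of the LLM agent. The sketch only illustrates the shape of the request.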
As part of an incubator project with NFDI4DS, DataPLANT is contributing to the development of Software Management Plans, with a focus on data exchange between RDMO and DataPLAN. Our MCP server can convert free-text inputs into machine-actionable JSON, which aligns with the goal of the DMP4NFDI incubator project. Looking ahead, the MCP server will support harmonized DMP implementation across life science consortia (including NFDI4Biodiversity, NFDI4Microbiota, NFDI4Bioimage and FAIRagro) under the coordination of the BioData Interest Group, promoting unified, AI-enhanced data management practices across domains. Early integrations will include tools such as ISA-Wizard, with additional tools planned for future updates. With its modular and extensible architecture, the MCP ecosystem will continue to improve as MCP adoption grows, enabling more effective, collaborative, and FAIR-aligned research data management across disciplines. Integrating MCP into DMP creation tools not only facilitates document generation but also transforms static DMP documents into dynamic, adaptable frameworks that evolve alongside research workflows. This AI-enabled strategy promotes reproducibility, enhances scalability, and increases the efficiency of research workflows.
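As a rough illustration of turning free text into machine-actionable JSON, the sketch below extracts labeled fields from a short proposal snippet. It is only a stand-in: the record describes LLM-based analysis, whereas this example uses simple pattern matching. The field names, labels, and sample proposal are assumptions made for illustration and do not reflect the DataPLAN schema.

```python
import json
import re

# Hypothetical output keys mapped to the labels expected in the free text.
FIELDS = {
    "project_name": r"Project(?: name)?",
    "research_topic": r"Topic",
    "objective": r"Objective",
}


def extract_metadata(text: str) -> dict:
    """Return a dict with one entry per field; missing fields are None."""
    record = {}
    for key, label in FIELDS.items():
        # Match "<label>: <value>" on a single line, ignoring case.
        pattern = rf"^[ \t]*{label}[ \t]*:[ \t]*(.+?)[ \t]*$"
        match = re.search(pattern, text, flags=re.IGNORECASE | re.MULTILINE)
        record[key] = match.group(1) if match else None
    return record


# Made-up sample input, for illustration only.
proposal = """Project name: Drought response in barley
Topic: plant stress physiology
Objective: identify drought-responsive genes"""

record = extract_metadata(proposal)
print(json.dumps(record, indent=2))
```

Fields that cannot be found are kept as `null` in the JSON output rather than dropped, so a downstream tool can tell which information still has to be collected.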
Files
Name | Size |
---|---|
CoRDI_2025_paper_218.pdf (md5:44eb78203dec6386bc4155cedd42626f) | 170.7 kB |