Published July 23, 2025 | Version v3
Project milestone Open

eDNAqua-Plan Milestone 7: The eDNAqua-Plan conceptual landscape proposal v1

  • 1. Flanders Marine Institute
  • 2. EMBL-EBI
  • 3. UNESCO
  • 4. INRAE
  • 5. University of Valencia
  • 6. BIOPOLIS | CIBIO - Research Centre in Biodiversity and Genetic Resources
  • 7. CIBIO-InBIO/BIOPOLIS
  • 8. Finnish Environment Institute
  • 9. European Marine Biological Resource Centre
  • 10. European Bioinformatics Institute
  • 1. Instituut voor Landbouw en Visserijonderzoek
  • 2. University of Duisburg-Essen
  • 3. University of Minho
  • 4. Wageningen University & Research
  • 5. Sorbonne University
  • 6. Finnish Environment Institute
  • 7. European Marine Observation and Data Network
  • 8. Global Biodiversity Information Facility
  • 9. University of Copenhagen
  • 10. Swedish University of Agricultural Sciences
  • 11. US Geological Survey
  • 12. Intergovernmental Oceanographic Commission of UNESCO

Description

The widespread adoption of next-generation sequencing (NGS) technologies has revolutionised aspects of biodiversity research and monitoring by allowing analysis of environmental DNA (eDNA) in aquatic and other environments. This has driven an escalation of biological measurement capacity, wide taxonomic coverage and consistent data-driven findings. The data ecosystem covers a gamut of entities: sample collection, sequencing, sequence processing, and taxonomic identification, which in turn is based upon reference library construction and taxonomies. This ecosystem includes (meta)data repositories, (meta)data standards, schemas, and formats. It also encompasses the information that flows to and from the various agents and organisations involved. Furthermore, it covers scientific papers and general scientific data repositories.
The current aquatic eDNA landscape has many unconnected or suboptimal links. Generally, the various data entities are stored in a plethora of repositories: regional, national, global and thematic, but also institutional, and sometimes only within scientific papers. Different, and sometimes unacknowledged, standards are used in these repositories. As a result, the FAIR principles are not being fully applied, often resulting in data that is not easily found, or is inaccessible or unusable. This applies to both research and commissioned biomonitoring of aquatic eDNA, and it means that the European aquatic biodiversity scientific and governmental communities are not being properly served.
Our mini-LLM exploration of scientific literature (Woollard et al. 2025) revealed the extent to which metadata about laboratory and bioinformatics workflows can be extracted from papers. This is, however, a more expensive and less reliable way to obtain metadata than simply pulling it in a machine-readable state from the appropriate repository.
In this document, we outline the current state of the European aquatic eDNA data ecosystem. We then propose a future landscape in which the various gaps are addressed and the (meta)data flow is made more FAIR, efficient, and sustainable. The proposed landscape outlined here is currently in draft form and will be finalised in the final deliverable from WP3. This future data ecosystem will be achieved by an increased adoption of interoperable standards and an expansion of the minimal metadata being collected. A common theme is that of federation rather than centralisation, as this is a pragmatic response to the desire to have institutional, national, or thematic repositories, and to the cost of building new centralised repositories. Using agreed interoperable (meta)data standards and adopting widely used web technologies (e.g. W3C) for data exposure and sharing will increase the resilience of the overall data infrastructure, safeguarding against any one resource ceasing to function, e.g. due to a lack of resources or funding. A proper data management plan, created by each research group before embarking on the work, would make collection, collation and deposition of (meta)data to appropriate repositories more straightforward and efficient. Furthermore, engagement with standards organisations means that the standards will be more sustainable, have wider community acceptance, and result in wider (meta)data interoperability. This will thus provide the European aquatic biodiversity scientific and governmental communities with more, and higher quality, data.
There will be significant financial costs associated with improving the repositories and developing data management capacity within the community, as well as time costs for researchers depositing more and better-quality (meta)data, but this will result in increased availability of valuable biodiversity data for decades to come. Furthermore, increasingly massive amounts of related data, e.g. oceanographic data, are being collected by remote sensing; simple metadata such as location (latitude, longitude) and date allow integration of valuable eDNA data alongside this physical/chemical information. This will contribute to getting continued value from the expansive research conducted across the European landscape, for the benefit of biodiversity on Earth.
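As a minimal sketch of how location and date alone could link an eDNA sample to remotely sensed data, the Python example below pairs each sample with the nearest observation inside a distance and time tolerance. Everything in it is an illustrative assumption rather than part of the proposal: the class names, the field names, the sea surface temperature variable, the coordinates, and the default tolerances of 25 km and 3 days.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt
from typing import List, Optional


@dataclass
class EdnaSample:
    # Hypothetical minimal sample metadata: identifier, position, date.
    sample_id: str
    latitude: float
    longitude: float
    collected: date


@dataclass
class OceanObservation:
    # Hypothetical remote-sensing record; the measured variable is illustrative.
    latitude: float
    longitude: float
    observed: date
    sea_surface_temp_c: float


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points in decimal degrees."""
    earth_radius_km = 6371.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(a))


def nearest_observation(
    sample: EdnaSample,
    observations: List[OceanObservation],
    max_km: float = 25.0,  # assumed spatial tolerance
    max_days: int = 3,     # assumed temporal tolerance
) -> Optional[OceanObservation]:
    """Return the spatially closest observation within both tolerances, or None."""
    best = None
    best_km = max_km
    for obs in observations:
        # Skip observations taken too long before or after the sample.
        if abs((obs.observed - sample.collected).days) > max_days:
            continue
        km = haversine_km(sample.latitude, sample.longitude, obs.latitude, obs.longitude)
        if km <= best_km:
            best, best_km = obs, km
    return best


# Made-up example records, for illustration only.
samples = [EdnaSample("S1", 51.23, 2.93, date(2024, 6, 10))]
observations = [
    OceanObservation(51.25, 2.95, date(2024, 6, 11), 15.2),
    OceanObservation(43.00, 5.00, date(2024, 6, 10), 21.0),
]

for s in samples:
    match = nearest_observation(s, observations)
    print(s.sample_id, match.sea_surface_temp_c if match else "no match")
```

A sample lacking coordinates or a collection date could not take part in this kind of join at all, which is one practical argument for treating those fields as required minimal metadata.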
We recommend that readers concentrate on the graphical current and future landscapes. This report includes a textual description that aims to provide more context and detail than could be provided on the map.
This draft proposal has benefitted from significant input from WP4 and WP5, which has helped to improve and finalise the current and future digital landscapes.

Files

eDNAqua-PlanWP3_conceptual_landscape_proposal-v1.1.pdf

Files (4.4 MB)

Additional details

Funding

European Commission
eDNAqua-plan - A Plan towards an eDNA reference library and data repository for Aquatic Organisms, navigating Europe towards the next generation biodiversity monitoring 101112800