Published December 9, 2025 | Version v1
Journal article Open

Challenges in Using AI-Based Citizen-Generated Plant Observations as Forensic Evidence in Biodiversity Investigations

  • 1. AMAP, University of Montpellier, IRD, CNRS, CIRAD, INRAE, Montpellier, France
  • 2. Inria, Montpellier, France

Description

Introduction

Citizen science platforms such as Pl@ntNet (Lefort et al. 2025), iNaturalist, and Flora Incognita have enabled the collection of tens of millions of plant occurrences over the last decade, many shared via the Global Biodiversity Information Facility (GBIF). Thanks to rapid advances in AI-assisted image identification and geospatial analysis, these data streams are increasingly being used for investigative purposes: from monitoring trade flows of Convention on International Trade in Endangered Species (CITES)-listed plants to detecting invasive species, documenting illegal cultivation in protected areas, or time-stamping habitat disturbances.

Results and operational work

AI-assisted observations can support biodiversity-related investigations in several distinct ways, depending on who produces the observations and how they are used. First, automated identification can help customs officers prioritize inspections at border entry points by flagging potentially protected plants, such as rare succulents or orchids, that are visible in luggage or shipments. In addition, AI-assisted citizen-science observations provide localized confirmation of species' presence in landscapes. Spatiotemporal clustering of such records can indicate the time and location of activities of interest, such as the sudden appearance of a critical invasive species at a sensitive site. Finally, AI-supported citizen-science data can contribute to early-warning systems by revealing anomalies in species-community patterns, prompting environmental agencies responsible for monitoring specific areas to conduct targeted field verification before ecological impacts intensify.
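As an illustration of the spatiotemporal clustering mentioned above, the following minimal sketch groups records of a single taxon into grid cells and flags cells where several records fall within a short time window. The function name `flag_bursts`, the grid size, the window length, and the minimum count are all assumptions chosen for the example; an operational system would likely rely on a proper spatial index and a validated clustering method.

```python
from collections import defaultdict
from datetime import date, timedelta


def flag_bursts(records, cell_deg=0.01, window_days=14, min_count=3):
    """Flag grid cells that receive a burst of records for one taxon.

    records: iterable of (latitude, longitude, event_date) tuples.
    Returns a list of (cell, first_date_in_burst, record_count).
    All thresholds are illustrative assumptions, not recommended values.
    """
    by_cell = defaultdict(list)
    for lat, lon, day in records:
        # Snap coordinates to a coarse grid cell (floor division also
        # behaves consistently for negative coordinates).
        cell = (int(lat // cell_deg), int(lon // cell_deg))
        by_cell[cell].append(day)

    window = timedelta(days=window_days)
    bursts = []
    for cell, days in by_cell.items():
        days.sort()
        start = 0
        # Slide a time window over the sorted dates; report the first burst.
        for end in range(len(days)):
            while days[end] - days[start] > window:
                start += 1
            count = end - start + 1
            if count >= min_count:
                bursts.append((cell, days[start], count))
                break
    return bursts
```

A flagged cell is only a prompt for field verification: the sketch says nothing about observer effort, so a burst may reflect a single enthusiastic contributor rather than a real change on the ground.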

To transform this potential into probative value suitable for management actions, several points are essential. The authenticity of observations requires a strong link between the digital asset and its source, exploiting the metadata of the device used for data collection and the identity of the citizen scientist. Integrity relies on secure storage, requiring the implementation of protection systems and monitoring of any possible transformation. The reliability of the observations themselves is linked to the implementation of calibrated AI scores, as well as the traceability of the models and versions used to generate the predictions. The reproducibility of the employed methods requires transparent metadata that complies with selected standards, in particular through the use of standardized fields, e.g., based on terms of the Darwin Core standard (Darwin Core Maintenance Group 2023).
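The integrity and traceability requirements above can be sketched as follows, under stated assumptions: a SHA-256 fingerprint of the image is stored next to a small record that uses a few Darwin Core terms and names the model version that produced the identification. The `imageSha256` key, the function names, and the way the model version is written into `dwc:identifiedBy` are hypothetical choices for this example, not part of Darwin Core or of any platform's actual schema.

```python
import hashlib


def fingerprint(image_bytes: bytes) -> str:
    """Return the SHA-256 hex digest of the raw image bytes."""
    return hashlib.sha256(image_bytes).hexdigest()


def build_record(image_bytes, taxon, score, model_name, model_version,
                 event_date, recorded_by):
    """Assemble a minimal provenance record for one observation.

    Keys prefixed with "dwc:" are Darwin Core terms; "imageSha256" is a
    hypothetical extension field introduced only for this sketch.
    """
    return {
        "dwc:scientificName": taxon,
        "dwc:eventDate": event_date,
        "dwc:recordedBy": recorded_by,
        # Assumption: the classifier and its version are recorded as the identifier.
        "dwc:identifiedBy": f"{model_name} {model_version}",
        "dwc:identificationRemarks": f"calibrated score {score:.2f}",
        "imageSha256": fingerprint(image_bytes),
    }


def verify(record, image_bytes) -> bool:
    """Check that the image has not changed since the record was built."""
    return record["imageSha256"] == fingerprint(image_bytes)
```

A hash only shows that the stored bytes are unchanged; linking the file to a device and a contributor, as the text requires for authenticity, needs additional signed metadata that this sketch does not cover.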

Scientific and ethical constraints must be taken into account from the outset. Rigorous quantification of uncertainty (calibrated confidence of AI models, identification of consensus predictions involving expert arbitration) is necessary to confirm the presence or absence (or even the abundance) of certain species or groups of species. Ethical safeguards should prevent harm by paying particular attention to sensitive localities and by ensuring General Data Protection Regulation (GDPR)-compliant processing of personal data.
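One common way to check whether confidence scores are calibrated is the expected calibration error (ECE), which compares the average confidence with the observed accuracy inside confidence bins. The sketch below is a plain-Python illustration of that standard metric; it is not described in the text as the method used by any of the platforms, and the bin count is an arbitrary assumption.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Compute the expected calibration error over equal-width bins.

    confidences: top-class scores in [0, 1].
    correct: booleans, True when the top-class prediction was right
             (for example, as confirmed by expert review).
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A score of exactly 1.0 goes into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Weight each bin's confidence/accuracy gap by its share of samples.
        ece += len(members) / total * abs(avg_conf - accuracy)
    return ece
```

A value near zero means that a score of, say, 0.9 corresponds to roughly 90% correct identifications, which is the property needed before a score can be cited as a measure of reliability.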

Current European projects are making concrete progress in meeting these requirements. Within the Horizon Europe project "safeGUARDing biodiversity and critical ecosystem services across sectors and scales" (GUARDEN), we are co-designing workflows with agencies and non-governmental organizations (NGOs) to explore transparent AI-assisted identification and classification services.

Discussion

Strengthening the reliability of citizen-science infrastructures for potential use in formal investigations requires careful attention to sustainability and interoperability. This includes developing a sustainable management model, backward-compatible APIs, and institutional partnerships to ensure long-term preservation and legal interoperability. Maintaining community trust remains paramount: contributors must see how their observations translate into conservation outcomes while retaining control over visibility and attribution.

Future improvements will consolidate this trajectory. On a large scale, content authenticity should be strengthened by adopting a regularly updated taxonomic repository that draws on the latest resources from the Plants of the World Online (POWO) and World Flora Online (WFO) platforms. These taxonomic standardization efforts should be complemented by vocabularies from the Darwin Core standard and by other standards from the Biodiversity Information Standards (TDWG) community, such as the World Geographical Scheme for Recording Plant Distributions (WGSRPD), in order to better guarantee the provenance of the data produced and to clearly document the roles of the various reviewers involved, whether automated AI-based classifiers, volunteer contributors validating observations, or expert botanists providing authoritative identifications.

The robustness and security of implemented services should be reinforced by additional tests that limit poor data quality (drawing in particular on the work of a dedicated group within the TDWG community), detect out-of-distribution sources and uncertainties, and identify duplicates. Strategies to improve the quality of decisions will need to involve humans, e.g., via active learning workflows (mechanisms that automatically request expert review when uncertainty or legal risk is highest), and expert-based consensus protocols preserving the transparency of divergent opinions. Multimodal validation, combining ground-based photos with drone and satellite images, as well as acoustic sensors, can strengthen chains of evidence.
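The review-routing mechanism described above could look like the following sketch, which is only one possible design. It measures uncertainty as the Shannon entropy of the classifier's probability vector and applies a stricter threshold when the candidate taxon is legally sensitive (for instance, CITES-listed). The function names and both threshold values are assumptions made for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def needs_expert_review(probs, legally_sensitive,
                        default_max=1.0, strict_max=0.3):
    """Decide whether an observation should be routed to an expert.

    probs: class probabilities returned by the classifier.
    legally_sensitive: True when the top candidate carries legal risk.
    Thresholds are hypothetical and would need tuning on real data.
    """
    # Tolerate less uncertainty when the legal stakes are higher.
    limit = strict_max if legally_sensitive else default_max
    return entropy(probs) > limit
```

With these thresholds, a moderately uncertain prediction such as `[0.8, 0.1, 0.1]` is sent to an expert only when the taxon is legally sensitive, while a flat prediction such as `[0.4, 0.3, 0.3]` is always sent.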

Conclusion

AI-assisted, citizen-generated observations already support screening, intelligence, and early warning in biodiversity enforcement. To elevate their evidential value, we must institutionalize provenance, uncertainty, and ethics by design, linking community platforms, international standards, and operational partners. The GUARDEN and Modern Approaches to the Monitoring of BiOdiversity (MAMBO) pilots illustrate a pathway toward forensic-ready biodiversity data systems that complement traditional monitoring and help protect living systems in real time.

References

  • Darwin Core Maintenance Group (2023) Darwin Core List of Terms. Biodiversity Information Standards (TDWG). URL: http://rs.tdwg.org/dwc/doc/list/2023-09-18
  • Lefort T, Affouard A, Charlier B, Lombardo J, Chouet M, Goëau H, Salmon J, Bonnet P, Joly A (2025) Cooperative learning of Pl@ntNet's Artificial Intelligence algorithm: How does it work and how can we improve it? Methods in Ecology and Evolution. URL: https://doi.org/10.1111/2041-210x.14486