Published February 6, 2026 | Version v1
Image Open

Dataset of Parasitoid Wasps and Associated Hymenoptera (DAPWH)

Description

Parasitoid wasps of the families Ichneumonidae and Braconidae are ecologically critical for regulating insect populations, yet they remain among the most taxonomically challenging groups because of their cryptic morphology and vast number of undescribed species. To address the scarcity of robust digital resources for these key groups, we present a curated image dataset designed to advance automated identification systems. The dataset contains 3,556 high-resolution images, primarily focused on Neotropical Ichneumonidae and Braconidae, and also includes supplementary families (Andrenidae, Apidae, Bethylidae, Chrysididae, Colletidae, Halictidae, Megachilidae, Pompilidae, and Vespidae) to improve model robustness. A subset of 1,739 images is annotated in COCO format, with multi-class bounding boxes for the full insect body, wing venation, and scale bars. This resource provides a foundation for developing computer vision models capable of identifying these families.

Specimens representing the families Braconidae and Ichneumonidae were primarily acquired from the Coleção Taxonômica do Departamento de Ecologia e Biologia Evolutiva da UFSCar (DCBU). The dataset comprises high-resolution images of specimens from the superfamily Ichneumonoidea as well as nine additional hymenopteran families: Andrenidae, Apidae, Bethylidae, Chrysididae, Colletidae, Halictidae, Megachilidae, Pompilidae, and Vespidae.

The Apoidea families (Andrenidae, Apidae, Colletidae, Halictidae, Megachilidae) and Chrysididae were primarily sourced from the Coleção Entomológica Prof. J.M.F. Camargo (RPSP), with additional contributions from the Museu de Zoologia da Universidade de São Paulo (MZUSP) and the Spencer Collection. Finally, specimens of Bethylidae, Pompilidae, and Vespidae were acquired from MZUSP.

The repository includes the raw image files organized by taxonomic family and comprehensive metadata provided in the COCO (Common Objects in Context) annotation format via JSON files, facilitating its immediate use in object detection and classification pipelines.
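As a rough illustration of how the COCO JSON files might be inspected before training, the sketch below counts images and bounding boxes per category using only the standard COCO top-level keys (`images`, `annotations`, `categories`). The file path and the category names in the in-memory sample are hypothetical; the actual names used in DAPWH should be read from the `categories` list in the released JSON.

```python
import json
from collections import Counter


def load_coco(path):
    """Read a COCO-format annotation file into a dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def summarize_coco(coco):
    """Count images, annotated images, and bounding boxes per category name."""
    id_to_name = {c["id"]: c["name"] for c in coco.get("categories", [])}
    annotations = coco.get("annotations", [])
    boxes = Counter(
        id_to_name.get(a["category_id"], "unknown") for a in annotations
    )
    return {
        "num_images": len(coco.get("images", [])),
        "num_annotated_images": len({a["image_id"] for a in annotations}),
        "boxes_per_category": dict(boxes),
    }


# Tiny in-memory sample; the category names here are hypothetical placeholders.
sample = {
    "images": [{"id": 1, "file_name": "a.jpg"}, {"id": 2, "file_name": "b.jpg"}],
    "categories": [
        {"id": 1, "name": "insect_body"},
        {"id": 2, "name": "scale_bar"},
    ],
    "annotations": [
        # COCO bounding boxes are [x, y, width, height] in pixels.
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [5, 5, 100, 60]},
        {"id": 11, "image_id": 1, "category_id": 2, "bbox": [10, 80, 40, 4]},
    ],
}
print(summarize_coco(sample))
```

With the real data, something like `summarize_coco(load_coco("path/to/annotations.json"))` would give a quick check of how many of the images carry annotations, where the path is a placeholder for wherever the JSON files sit in the extracted archive.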

Files (32.1 GB)

DAPWH.zip (32.1 GB)
md5:1434d2bf974c030d9f9a95e6369b7ff5
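Given the size of the archive, it may be worth checking the download against the MD5 value listed for DAPWH.zip before extracting it. The following is a minimal sketch using Python's standard `hashlib`; it reads the file in chunks so the 32.1 GB archive does not need to fit in memory. The function names and the local file path are assumptions for illustration only.

```python
import hashlib

# MD5 value as listed in the file table of this record.
EXPECTED_MD5 = "1434d2bf974c030d9f9a95e6369b7ff5"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected value."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_expected("DAPWH.zip")` on the downloaded file (the path is whatever location it was saved to) should return `True` for an intact download; a `False` result suggests the transfer was truncated or corrupted.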

Additional details

Software

Repository URL
https://github.com/joaomh/DAPWH-2026
Programming language
Python