Published April 27, 2022 | Version 0.0.2
Dataset Open

The HumBug Challenge: ComParE 2022


A large-scale multi-species dataset of acoustic recordings

Dataset compatible with two papers:

This is a large-scale multi-species dataset of mosquito recordings collected from multiple locations globally and via different collection methods. In total, we present 20 hours of labelled mosquito data with 15 hours of corresponding background noise, recorded at the sites of 8 experiments. Of these, 64,843 seconds (about 18 hours) contain species metadata, covering 36 species (or species complexes).

This repository contains:

  • Audio files to be extracted into audio/data/train and audio/data/dev/{a/b} respectively
  • Metadata in CSV format: neurips_2021_zenodo_0_0_2.csv
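As a rough illustration of how the metadata file might be inspected once downloaded, the sketch below reads CSV lines with Python's standard `csv` module, lists the header, and sums a duration column per label. The column names are not documented on this page, so `group_column` and `length_column` are hypothetical parameters supplied by the caller after checking the real header; the helper names `list_columns` and `seconds_per_group` are also illustrative and not part of the dataset tooling.

```python
import csv
from collections import defaultdict
from pathlib import Path


def list_columns(lines):
    """Return the header names from an iterable of CSV lines."""
    return next(csv.reader(lines))


def seconds_per_group(lines, group_column, length_column):
    """Sum a numeric duration column for each value of a label column.

    Both column names are assumptions passed in by the caller; check
    them against list_columns() before use.
    """
    totals = defaultdict(float)
    for row in csv.DictReader(lines):
        raw_length = (row[length_column] or "").strip()
        if not raw_length:
            continue  # skip rows with no recorded duration
        label = (row[group_column] or "").strip() or "(unlabelled)"
        totals[label] += float(raw_length)
    return dict(totals)


if __name__ == "__main__":
    # File name taken from the repository listing; adjust the path to
    # wherever the CSV was downloaded.
    metadata_path = Path("neurips_2021_zenodo_0_0_2.csv")
    if metadata_path.exists():
        with metadata_path.open(newline="", encoding="utf-8") as handle:
            print(list_columns(handle))
```

Listing the header first avoids guessing at field names; once the actual species and duration columns are known, they can be passed to `seconds_per_group` to compare against the totals quoted in the description.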



Funding from the 2014 Google Impact Challenge Award, and The Bill and Melinda Gates Foundation (


Files (4.4 GB)

Size
152.7 MB
1.8 MB
4.2 GB

Additional details