Published January 23, 2023 | Version v1
Dataset Open

A Multimodal Dataset for Automatic Edge-AI Cough Detection

  • 1. Embedded Systems Laboratory (ESL) - EPFL
  • 2. BCAM - Basque Center for Applied Mathematics


The number of times a patient coughs per day is an essential biomarker for determining the efficacy of novel antitussive therapies and for personalizing patient care. There is a need for wearable devices that use multimodal sensors to run accurate, privacy-preserving, automatic cough counting algorithms directly on the device in an edge-AI fashion. To advance this research field, we contribute the first publicly accessible cough counting dataset of multimodal biosignals. The database contains nearly 4 hours of biosignal data, with both acoustic and kinematic modalities, covering 4,300 annotated cough events. It also includes several non-cough sounds (i.e., breathing, laughing, and throat clearing), background noises (i.e., music, traffic, and bystander coughing), and motion scenarios (i.e., sitting and walking) that mimic daily life activities, which the research community can use to accelerate ML algorithm development.

For detailed information about using this dataset to train edge-AI models, along with example code, please refer to our public Git repository.


Files (852.4 MB)


Additional details


Funding: European Commission, grant 101017915, DIGIPREDICT – Edge AI-deployed DIGItal Twins for PREDICTing disease progression and need for early intervention in infectious and cardiovascular diseases beyond COVID-19