Dataset Open Access
Lara Orlandic; Tomas Teijeiro; David Atienza
Cough audio signal classification has been successfully used to diagnose a variety of respiratory conditions, and there has been significant interest in leveraging Machine Learning (ML) to provide widespread COVID-19 screening. The COUGHVID dataset provides over 20,000 crowdsourced cough recordings representing a wide range of subject ages, genders, geographic locations, and COVID-19 statuses. Furthermore, experienced pulmonologists labeled more than 2,000 recordings to diagnose medical abnormalities present in the coughs, thereby contributing one of the largest expert-labeled cough datasets in existence that can be used for a plethora of cough audio classification tasks. As a result, the COUGHVID dataset contributes a wealth of cough recordings for training ML models to address the world’s most urgent health crises.
Name | Size | MD5 |
---|---|---|
public_dataset.zip | 951.4 MB | 5c30a8b00c8bb7783a2c15a48cb8ea9e |
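The MD5 checksum listed for public_dataset.zip can be used to confirm that a download of this size completed without corruption. The following is a minimal sketch of one way to do that in Python, not an official tool from the dataset authors. The function names `md5_of_file` and `verify_download` are hypothetical, and the local file path is assumed to be wherever the archive was saved.

```python
import hashlib
from pathlib import Path

# Checksum as listed on this record for public_dataset.zip.
EXPECTED_MD5 = "5c30a8b00c8bb7783a2c15a48cb8ea9e"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that a ~1 GB archive does not need to fit in memory."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 matches the expected hex digest
    (comparison is case-insensitive)."""
    return md5_of_file(path) == expected.lower()
```

Calling `verify_download("public_dataset.zip")` after the download finishes returns `True` only if the local file matches the published checksum. A `False` result suggests a truncated or altered file, and the archive should be downloaded again.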
Statistic | All versions | This version |
---|---|---|
Views | 3,667 | 3,470 |
Downloads | 1,893 | 1,818 |
Data volume | 1.8 TB | 1.7 TB |
Unique views | 3,165 | 3,048 |
Unique downloads | 1,235 | 1,179 |