Dataset Open Access

Coughs: ESC-50 and FSDKaggle2018

Mahmoud Abdelkhalek; Jinyi Qiu; Michelle Hernandez; Alper Bozkurt; Edgar Lobaton


DataCite XML Export

<?xml version='1.0' encoding='utf-8'?>
<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datacite.org/schema/kernel-4" xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4.1/metadata.xsd">
  <identifier identifierType="DOI">10.5281/zenodo.5136592</identifier>
  <creators>
    <creator>
      <creatorName>Mahmoud Abdelkhalek</creatorName>
      <affiliation>North Carolina State University</affiliation>
    </creator>
    <creator>
      <creatorName>Jinyi Qiu</creatorName>
      <affiliation>North Carolina State University</affiliation>
    </creator>
    <creator>
      <creatorName>Michelle Hernandez</creatorName>
      <affiliation>University of North Carolina at Chapel Hill</affiliation>
    </creator>
    <creator>
      <creatorName>Alper Bozkurt</creatorName>
      <affiliation>North Carolina State University</affiliation>
    </creator>
    <creator>
      <creatorName>Edgar Lobaton</creatorName>
      <affiliation>North Carolina State University</affiliation>
    </creator>
  </creators>
  <titles>
    <title>Coughs: ESC-50 and FSDKaggle2018</title>
  </titles>
  <publisher>Zenodo</publisher>
  <publicationYear>2021</publicationYear>
  <subjects>
    <subject>audio dataset</subject>
    <subject>Kaggle</subject>
  </subjects>
  <dates>
    <date dateType="Issued">2021-07-26</date>
  </dates>
  <resourceType resourceTypeGeneral="Dataset"/>
  <alternateIdentifiers>
    <alternateIdentifier alternateIdentifierType="url">https://zenodo.org/record/5136592</alternateIdentifier>
  </alternateIdentifiers>
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsVersionOf">10.5281/zenodo.5136591</relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="URL" relationType="IsPartOf">https://zenodo.org/communities/freesound-datasets</relatedIdentifier>
  </relatedIdentifiers>
  <version>1.0.0</version>
  <rightsList>
    <rights rightsURI="https://creativecommons.org/licenses/by-nc/4.0/legalcode">Creative Commons Attribution Non Commercial 4.0 International</rights>
    <rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights>
  </rightsList>
  <descriptions>
    <description descriptionType="Abstract">&lt;p&gt;This dataset consists of timestamps for coughs contained in files extracted from the &lt;a href="https://github.com/karolpiczak/ESC-50"&gt;ESC-50&lt;/a&gt; and &lt;a href="https://zenodo.org/record/2552860#.YPip9kBRUZg"&gt;FSDKaggle2018&lt;/a&gt; datasets.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Citation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This dataset was generated and used in our paper:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Mahmoud Abdelkhalek, Jinyi Qiu, Michelle Hernandez, Alper Bozkurt, Edgar Lobaton, &amp;ldquo;Investigating the Relationship between Cough Detection and Sampling Frequency for Wearable Devices,&amp;rdquo; in the 43rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2021.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Please cite this paper if you use the &lt;em&gt;timestamps.csv&lt;/em&gt; file in your work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Generation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The cough timestamps given in the &lt;em&gt;timestamps.csv&lt;/em&gt; file were generated using the cough templates given in figures 3 and 4 in the paper:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A. H. Morice, G. A. Fontana, M. G. Belvisi, S. S. Birring, K. F. Chung, P. V. Dicpinigaitis, J. A. Kastelik, L. P. McGarvey, J. A. Smith, M. Tatar, J. Widdicombe, &amp;quot;ERS guidelines on the assessment of cough&amp;quot;, European Respiratory Journal 2007 29: 1256-1276; DOI: 10.1183/09031936.00101006&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;More precisely, 40 files labelled as &amp;quot;coughing&amp;quot; in the ESC-50 dataset and 273 files labelled as &amp;quot;Cough&amp;quot; in the FSDKaggle2018 dataset were manually searched using &lt;a href="https://www.audacityteam.org/"&gt;Audacity&lt;/a&gt; for segments of audio that closely matched the aforementioned templates, both visually and auditorily. Some files did not contain any coughs at all, while other files contained several coughs. Therefore, only the files that contained at least one cough are included in the &lt;em&gt;coughs&lt;/em&gt; directory. In total, the timestamps of 768 cough segments with lengths ranging from 0.2 seconds to 0.9 seconds were extracted.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Description&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The audio files are presented in &lt;em&gt;wav&lt;/em&gt; format in the &lt;em&gt;coughs&lt;/em&gt; directory. Files named in the general format of &amp;quot;*-*-*-24.wav&amp;quot; were extracted from the ESC-50 dataset, while all other files were extracted from the FSDKaggle2018 dataset.&lt;/p&gt;

&lt;p&gt;The &lt;em&gt;timestamps.csv&lt;/em&gt; file contains the timestamps for the coughs and it consists of four columns:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;file_name,cough_number,start_time,end_time&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Files in the &lt;em&gt;file_name&lt;/em&gt; column can be found in the &lt;em&gt;coughs&lt;/em&gt; directory. &lt;em&gt;cough_number&lt;/em&gt; refers to the index of the cough in the corresponding file. For example, if the file &lt;em&gt;X.wav&lt;/em&gt; contains 5 coughs, then &lt;em&gt;X.wav&lt;/em&gt; will be repeated 5 times under the &lt;em&gt;file_name&lt;/em&gt; column, and for each row, the &lt;em&gt;cough_number&lt;/em&gt; will range from 1 to 5. &lt;em&gt;start_time&lt;/em&gt; refers to the starting time of a cough segment measured in seconds, while &lt;em&gt;end_time&lt;/em&gt; refers to the end time of a cough segment measured in seconds.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Licensing&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The ESC-50 dataset as a whole is licensed under the &lt;a href="https://creativecommons.org/licenses/by-nc/3.0/"&gt;Creative Commons Attribution-NonCommercial license&lt;/a&gt;. Individual files in the ESC-50 dataset are licensed under different Creative Commons licenses. For a list of these licenses, see &lt;a href="https://github.com/karolpiczak/ESC-50/blob/master/LICENSE"&gt;LICENSE&lt;/a&gt;. The ESC-50 files in the &lt;em&gt;coughs&lt;/em&gt; directory are given for convenience only, and have not been modified from their original versions. To download the original files, see the &lt;a href="https://github.com/karolpiczak/ESC-50"&gt;ESC-50&lt;/a&gt; dataset.&lt;/p&gt;

&lt;p&gt;The FSDKaggle2018 dataset as a whole is licensed under the &lt;a href="https://creativecommons.org/licenses/by/4.0/"&gt;Creative Commons Attribution 4.0 International license&lt;/a&gt;. Individual files in the FSDKaggle2018 dataset are licensed under different Creative Commons licenses. For a list of these licenses, see the &lt;em&gt;License&lt;/em&gt; section in &lt;a href="https://zenodo.org/record/2552860"&gt;FSDKaggle2018&lt;/a&gt;. The FSDKaggle2018 files in the &lt;em&gt;coughs&lt;/em&gt; directory are given for convenience only, and have not been modified from their original versions. To download the original files, see the &lt;a href="https://zenodo.org/record/2552860"&gt;FSDKaggle2018&lt;/a&gt; dataset.&lt;/p&gt;

&lt;p&gt;The &lt;em&gt;timestamps.csv&lt;/em&gt; file is licensed under the &lt;a href="https://creativecommons.org/licenses/by-nc/4.0/"&gt;Creative Commons Attribution-NonCommercial 4.0 International license&lt;/a&gt;.&lt;/p&gt;</description>
    <description descriptionType="Other">This work was supported by the National Science Foundation under awards IIS-1915599 and EEC-1160483 (ERC for ASSIST).</description>
  </descriptions>
</resource>
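
The description above specifies the four columns of timestamps.csv and the file-naming rule that separates ESC-50 files from FSDKaggle2018 files. As a minimal sketch of how those two pieces could be used together, the Python below groups cough segments by file and guesses each file's source dataset from its name. It assumes the CSV begins with a header row matching the column list given in the description. The function names and the sample rows are hypothetical and are not taken from the dataset.

```python
import csv
import io
from collections import defaultdict
from fnmatch import fnmatchcase


def load_cough_segments(csv_file):
    """Group (cough_number, start_time, end_time) tuples by file_name.

    Assumes a header row: file_name,cough_number,start_time,end_time.
    Times are in seconds, as stated in the record description.
    """
    segments = defaultdict(list)
    for row in csv.DictReader(csv_file):
        segments[row["file_name"]].append(
            (int(row["cough_number"]), float(row["start_time"]), float(row["end_time"]))
        )
    return dict(segments)


def source_dataset(file_name):
    """Apply the naming rule from the description: "*-*-*-24.wav" means ESC-50."""
    return "ESC-50" if fnmatchcase(file_name, "*-*-*-24.wav") else "FSDKaggle2018"


# Hypothetical rows for illustration only; real values come from timestamps.csv.
SAMPLE = """file_name,cough_number,start_time,end_time
1-0000-A-24.wav,1,0.50,0.90
1-0000-A-24.wav,2,1.25,1.75
example.wav,1,0.10,0.40
"""

segments = load_cough_segments(io.StringIO(SAMPLE))
for name, coughs in segments.items():
    for number, start, end in coughs:
        print(f"{name} ({source_dataset(name)}): cough {number} lasts {end - start:.2f} s")
```

To run this against the real data, replace the `io.StringIO(SAMPLE)` argument with a handle from `open("timestamps.csv", newline="")`. The matching audio for each `file_name` is in the coughs directory.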