Dataset Open Access

Shared Acoustic Codes Underlie Emotional Communication in Music and Speech - Evidence from Deep Transfer Learning (Datasets)

Coutinho, Eduardo


JSON Export

{
  "files": [
    {
      "links": {
        "self": "https://zenodo.org/api/files/52d79396-beab-495a-9984-5e0e2205aadc/annotations.zip"
      }, 
      "checksum": "md5:0fd3f99bf022f0825ccd7a0225cfad06", 
      "bucket": "52d79396-beab-495a-9984-5e0e2205aadc", 
      "key": "annotations.zip", 
      "type": "zip", 
      "size": 19091
    }, 
    {
      "links": {
        "self": "https://zenodo.org/api/files/52d79396-beab-495a-9984-5e0e2205aadc/features.zip"
      }, 
      "checksum": "md5:3e796d71af35588f9b328a1cbb6a9431", 
      "bucket": "52d79396-beab-495a-9984-5e0e2205aadc", 
      "key": "features.zip", 
      "type": "zip", 
      "size": 111531131
    }
  ], 
  "owners": [
    24641
  ], 
  "doi": "10.5281/zenodo.345944", 
  "stats": {
    "version_unique_downloads": 52.0, 
    "unique_views": 637.0, 
    "views": 685.0, 
    "version_views": 684.0, 
    "unique_downloads": 52.0, 
    "version_unique_views": 636.0, 
    "volume": 4796640455.0, 
    "version_downloads": 85.0, 
    "downloads": 85.0, 
    "version_volume": 4796640455.0
  }, 
  "links": {
    "doi": "https://doi.org/10.5281/zenodo.345944", 
    "conceptdoi": "https://doi.org/10.5281/zenodo.600657", 
    "bucket": "https://zenodo.org/api/files/52d79396-beab-495a-9984-5e0e2205aadc", 
    "conceptbadge": "https://zenodo.org/badge/doi/10.5281/zenodo.600657.svg", 
    "html": "https://zenodo.org/record/345944", 
    "latest_html": "https://zenodo.org/record/345944", 
    "badge": "https://zenodo.org/badge/doi/10.5281/zenodo.345944.svg", 
    "latest": "https://zenodo.org/api/records/345944"
  }, 
  "conceptdoi": "10.5281/zenodo.600657", 
  "created": "2017-06-01T14:38:18.006580+00:00", 
  "updated": "2020-01-24T19:24:47.578372+00:00", 
  "conceptrecid": "600657", 
  "revision": 15, 
  "id": 345944, 
  "metadata": {
    "access_right_category": "success", 
    "doi": "10.5281/zenodo.345944", 
    "description": "<p>This repository contains the datasets used in the article \"Shared Acoustic Codes Underlie Emotional Communication in Music and Speech - Evidence from Deep Transfer Learning\" (Coutinho &amp; Schuller, 2017).\u00a0</p>\n\n<p>In that article four different data sets were used: SEMAINE, RECOLA, ME14 and MP (acronyms and datasets described below). The SEMAINE (speech) and ME14 (music) corpora were used for the unsupervised\u00a0training of the Denoising Auto-encoders (domain adaptation stage) - only the audio features extracted from the audio files in these corpora were used and are provided in this repository. The RECOLA (speech) and MP (music) corpora were used for the supervised\u00a0training phase - \u00a0both the audio features extracted from the audio files and the Arousal and Valence annotations were used. In this repository, we provide the audio features extracted from the audio files for both corpora, and Arousal and Valence annotations for some of the music datasets (those that the author of this repository is the data curator).</p>\n\n<p>Below, you can find description of the various corpora, the details about the data stored in this repository and information on how to obtain the rest of the data used by Coutinho and Schuller (2017).</p>\n\n<p><strong>SEMAINE (speech)</strong></p>\n\n<p>The SEMAINE corpus (McKeown, Valstar, Cowie, Pantic &amp; Schroder, 2012) was developed specifically to address the task of achieving emotion-rich interactions, and it is adequate for this task as it comprises a wide range of emotional speech. It includes video and speech recordings of spontaneous interactions between human and emotionally stereotyped `characters'. Coutinho &amp; Schuller (2017) used a subset of this database (called <em>Solid-SAL</em>). The <em>Solid-SAL</em> dataset is freely available for scientific research purposes (see http://semaine-db.eu). 
This repository includes the audio features used in\u00a0Coutinho &amp; Schuller (2017) (under features/SEMAINE).</p>\n\n<p><strong>RECOLA (speech)</strong></p>\n\n<p>The RECOLA database (Ringeval, Sonderegger, Sauer &amp; Lalanne, 2013) consists of multimodal recordings (audio, video, and peripheral physiological activity) of spontaneous dyadic interactions between French adults.\u00a0Coutinho &amp; Schuller (2017) used the RECOLA-Audio module, which consists of the audio recordings of each participant in the dyadic phase of the task. In particular, they used the non-segmented high-quality audio signals (WAV format, 44.1 kHz, 16 bits), obtained through unidirectional headset microphones, of the first five minutes of each interaction. Annotations consist of time-continuous ratings of the levels of the Arousal and Valence dimensions of emotion perceived by each rater while watching and listening to the audio-visual recordings of each participant's task. The publicly available annotated dataset includes only part of the data, amounting to\u00a023 instances. The time frame length used by Coutinho &amp; Schuller (2017)\u00a0is 1s (the original annotations were downsampled).\u00a0This repository includes the audio features used in\u00a0Coutinho &amp; Schuller (2017) (under features/RECOLA). To obtain the annotations, please contact the author of the original study (see https://diuf.unifr.ch/diva/recola/download.html for further details).</p>\n\n<p><strong>ME14 (music)</strong></p>\n\n<p>The MediaEval \"Emotion in Music\" task is dedicated to the estimation of Arousal and Valence scores continuously in time and value for song excerpts from the Free Music Archive.\u00a0Coutinho and Schuller (2017) used the whole corpus (development and test sets for the 2014 challenge), which includes 1,744 songs belonging to 11 musical styles - Soul, Blues, Electronic, Rock, Classical, Hip-Hop, International, Folk, Jazz, Country, and Pop (maximum of five songs per artist). 
This repository includes the audio features used in Coutinho &amp; Schuller (2017) (under features/ME14). The full dataset (including annotations) can be obtained from http://www.multimediaeval.org/mediaeval2014/emotion2014/.</p>\n\n<p><strong>MP (music)</strong></p>\n\n<p>This is a corpus compiled specifically for the work described in Coutinho &amp; Schuller (2017) using data collected in four\u00a0previous studies. It consists of emotionally diverse full music\u00a0pieces from a variety of musical styles (Classical and contemporary Western Art, Baroque, Bossa Nova, Rock, Pop, Heavy Metal, and Film Music). Annotations were obtained in controlled laboratory experiments whereby the emotional character of each piece was evaluated time-continuously in terms of the levels of\u00a0Arousal and Valence perceived by listeners (between 35 and 52 raters across the four studies). Some details about the various studies are given below.</p>\n\n<ul>\n\t<li>MP<sub>DB1</sub>:\u00a0This subset of the MP corpus consists of the data reported by Korhonen (2004), and kindly made available by the author. This dataset includes six full music pieces (or long excerpts) ranging from 151s to 315s in length (only classical music).\u00a0Each piece was annotated by 35 participants (14 females).\u00a0The time series corresponding to each music piece were collected at 1Hz.\u00a0The golden standard for each piece was computed by averaging the individual time series across all raters. This repository includes the audio features used in\u00a0Coutinho &amp; Schuller (2017) (under features/MP/DB1). 
To obtain the labels, please contact the author of the original study.</li>\n\t<li>MP<sub>DB2</sub>:\u00a0The dataset by\u00a0Coutinho &amp; Cangelosi (2011) includes 9 full pieces (43s to 240s long) of classical music (romantic repertoire) annotated by 39 subjects (19 females).\u00a0Values were recorded every time the mouse was moved, with a precision of 1 ms.\u00a0The resultant time series were then resampled (moving average) to a synchronous rate of 1 Hz.\u00a0The golden standard for each piece was computed by averaging the individual time series across all raters. This repository includes the audio features (under features/MP/DB2) and labels (under annotations/MP/DB2)\u00a0used in\u00a0Coutinho &amp; Schuller (2017).</li>\n\t<li>MP<sub>DB3</sub>:\u00a0This dataset was\u00a0collected by Coutinho &amp; Dibben (2013) and consists of 8 pieces of film music (84s to 130s long) taken from the late 20th\u00a0century Hollywood film repertoire.\u00a0Emotion ratings were given by 52 participants (26 females).\u00a0The annotation procedure, data processing, and golden standard calculations were identical to MP<sub>DB2</sub>. This repository includes the audio features (under features/MP/DB3) and labels (under annotations/MP/DB3)\u00a0used in\u00a0Coutinho &amp; Schuller (2017).</li>\n\t<li>MP<sub>DB4</sub>:\u00a0This dataset was\u00a0collected by\u00a0Grewe, Nagel, Kopiez and Altenm\u00fcller (2007), and kindly made available by the authors.\u00a0It includes seven music pieces (127s to 502s in length) of heterogeneous styles (e.g., Rock, Pop, Heavy Metal, Classical).\u00a0Each music piece was annotated by 38 participants (29 females) using an identical methodology to MP<sub>DB2</sub>\u00a0and\u00a0MP<sub>DB3</sub>. Data processing and golden standard calculations were also identical. This repository includes the audio features (under features/MP/DB4) used in\u00a0Coutinho &amp; Schuller (2017). 
To obtain the labels, please contact the authors of the original study.</li>\n</ul>\n\n<p>\u00a0</p>\n\n<p><strong>Bibliography</strong></p>\n\n<p>Coutinho, E., &amp; Cangelosi, A. (2011). Musical emotions: predicting second-by-second subjective feelings of emotion from low-level psychoacoustic features and physiological measurements.\u00a0<em>Emotion</em>,\u00a0<em>11</em>(4), 921.</p>\n\n<p>Coutinho, E., &amp; Dibben, N. (2013). Psychoacoustic cues to emotion in speech prosody and music.\u00a0<em>Cognition &amp; Emotion</em>,\u00a0<em>27</em>(4), 658-684.</p>\n\n<p>Coutinho, E., &amp; Schuller, B. (2017). Shared acoustic codes underlie emotional communication in music and speech - Evidence from deep transfer learning. <em>PLoS ONE</em>, <em>12</em>(6), e0179289. DOI: 10.1371/journal.pone.0179289.</p>\n\n<p>Grewe, O., Nagel, F., Kopiez, R., &amp; Altenm\u00fcller, E. (2007). Emotions over time: synchronicity and development of subjective, physiological, and facial affective reactions to music. <em>Emotion</em>, <em>7</em>(4), pp. 774-788. DOI: 10.1037/1528-3542.7.4.774.</p>\n\n<p>Korhonen, M. (2004). Modeling Continuous Emotional Appraisals of Music Using System Identification. Available from: http://hdl.handle.net/10012/879.</p>\n\n<p>McKeown, G., Valstar, M., Cowie, R., Pantic, M., &amp; Schroder, M. (2012). The SEMAINE Database: Annotated Multimodal Records of Emotionally Colored Conversations between a Person and a Limited Agent. <em>IEEE Transactions on Affective Computing</em>, 3, pp. 5-17. DOI: 10.1109/T-AFFC.2011.20.</p>\n\n<p>Ringeval, F., Sonderegger, A., Sauer, J., &amp; Lalanne, D. (2013). Introducing the RECOLA Multimodal Corpus of Remote Collaborative and Affective Interactions. In <em>Proceedings of the 2nd International Workshop on Emotion Representation, Analysis and Synthesis in Continuous Time and Space (EmoSPACE 2013)</em>, Shanghai, China. IEEE.</p>", 
    "license": {
      "id": "CC-BY-4.0"
    }, 
    "title": "Shared Acoustic Codes Underlie Emotional Communication in Music and Speech  - Evidence from Deep Transfer Learning (Datasets)", 
    "relations": {
      "version": [
        {
          "count": 1, 
          "index": 0, 
          "parent": {
            "pid_type": "recid", 
            "pid_value": "600657"
          }, 
          "is_last": true, 
          "last_child": {
            "pid_type": "recid", 
            "pid_value": "345944"
          }
        }
      ]
    }, 
    "keywords": [
      "music, emotion, arousal, valence, time-continuous"
    ], 
    "publication_date": "2017-03-06", 
    "creators": [
      {
        "affiliation": "University of Liverpool", 
        "name": "Coutinho, Eduardo"
      }
    ], 
    "access_right": "open", 
    "resource_type": {
      "type": "dataset", 
      "title": "Dataset"
    }, 
    "related_identifiers": [
      {
        "scheme": "doi", 
        "identifier": "10.1371/journal.pone.0179289", 
        "relation": "isSupplementTo"
      }, 
      {
        "scheme": "doi", 
        "identifier": "10.5281/zenodo.600657", 
        "relation": "isVersionOf"
      }
    ]
  }
}
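The "files" entries in the record above expose direct download links and md5 checksums, so a downloaded archive can be verified against the record metadata. Below is a minimal illustrative sketch (not part of the dataset) that lists the two files and checks a byte payload against a `md5:<hex>` checksum string; the `verify_download` helper name is hypothetical.

```python
import hashlib
import json

# "files" entries copied verbatim from the record metadata above.
RECORD_FILES = json.loads("""
[
  {"key": "annotations.zip",
   "links": {"self": "https://zenodo.org/api/files/52d79396-beab-495a-9984-5e0e2205aadc/annotations.zip"},
   "checksum": "md5:0fd3f99bf022f0825ccd7a0225cfad06",
   "size": 19091},
  {"key": "features.zip",
   "links": {"self": "https://zenodo.org/api/files/52d79396-beab-495a-9984-5e0e2205aadc/features.zip"},
   "checksum": "md5:3e796d71af35588f9b328a1cbb6a9431",
   "size": 111531131}
]
""")

def verify_download(data: bytes, checksum: str) -> bool:
    """Compare the md5 digest of downloaded bytes against a 'md5:<hex>' checksum string."""
    algo, _, expected = checksum.partition(":")
    if algo != "md5":
        raise ValueError(f"unsupported checksum algorithm: {algo}")
    return hashlib.md5(data).hexdigest() == expected

for f in RECORD_FILES:
    print(f"{f['key']} ({f['size']} bytes): {f['links']['self']}")
```

After downloading an archive (e.g. with `urllib.request.urlretrieve`), pass its bytes and the record's `checksum` field to `verify_download` before unpacking.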
Statistics          All versions   This version
Views               684            685
Downloads           85             85
Data volume         4.8 GB         4.8 GB
Unique views        636            637
Unique downloads    52             52
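The MP annotation pipeline described in the record (per-rater mouse-tracked ratings resampled to a synchronous 1 Hz series, then averaged across raters into a golden standard) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: the function names are hypothetical, the "moving average" is approximated by averaging within one-second windows, and holding the previous value over empty windows is an assumption the record does not specify.

```python
def resample_1hz(samples, duration_s):
    """Resample irregularly spaced (time_s, value) pairs to 1 Hz by averaging
    all samples falling in each one-second window [t, t+1).
    When a window contains no sample, hold the previous value (assumption)."""
    out = []
    last = samples[0][1]
    for t in range(duration_s):
        window = [v for ts, v in samples if t <= ts < t + 1]
        if window:
            last = sum(window) / len(window)
        out.append(last)
    return out

def golden_standard(rater_series):
    """Average time-aligned 1 Hz series element-wise across raters."""
    return [sum(vals) / len(vals) for vals in zip(*rater_series)]

# Two hypothetical raters annotating a 3-second excerpt:
r1 = resample_1hz([(0.2, 0.0), (0.8, 1.0), (1.5, 1.0), (2.1, 0.5)], 3)
r2 = resample_1hz([(0.4, 1.0), (1.9, 0.0), (2.6, 0.5)], 3)
gold = golden_standard([r1, r2])
```

The same averaging across raters applies to MP_DB1, whose time series were already collected at 1 Hz.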
