Published April 5, 2018 | Version 1.0.0
Dataset Open

The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS)

  • 1. University of Wisconsin, River Falls
  • 2. Ryerson University

Description

Citing the RAVDESS

The RAVDESS is released under a Creative Commons license that requires attribution (CC BY-NC-SA 4.0), so please cite the RAVDESS if it is used in your work in any form. Published academic papers should use the academic paper citation for our PLoS ONE paper. Personal works, such as machine learning projects or blog posts, should provide a URL to this Zenodo page, though a reference to our PLoS ONE paper would also be appreciated.

Academic paper citation

Livingstone SR, Russo FA (2018) The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English. PLoS ONE 13(5): e0196391. https://doi.org/10.1371/journal.pone.0196391.

Personal use citation

Include a link to this Zenodo page - https://zenodo.org/record/1188976

Commercial Licenses

Commercial licenses for the RAVDESS can be purchased.  For more information, please visit our license fee page, or contact us at ravdess@gmail.com.

Contact Information

If you would like further information about the RAVDESS, to purchase a commercial license, or if you experience any issues downloading files, please contact us at ravdess@gmail.com.

Example Videos

Watch a sample of the RAVDESS speech and song videos.

Emotion Classification Users

If you're interested in using machine learning to classify emotional expressions with the RAVDESS, please see our new RAVDESS Facial Landmark Tracking data set [Zenodo project page].

Construction and Validation

Full details on the construction and perceptual validation of the RAVDESS are described in our PLoS ONE paper - https://doi.org/10.1371/journal.pone.0196391.

The RAVDESS contains 7356 files. Each file was rated 10 times on emotional validity, intensity, and genuineness. Ratings were provided by 247 individuals who were characteristic of untrained adult research participants from North America. A further set of 72 participants provided test-retest data. High levels of emotional validity, interrater reliability, and test-retest intrarater reliability were reported. Validation data is open-access, and can be downloaded along with our paper from PLoS ONE.

Description

The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) contains 7356 files (total size: 24.8 GB). The database features 24 professional actors (12 female, 12 male), vocalizing two lexically-matched statements in a neutral North American accent. Speech includes calm, happy, sad, angry, fearful, surprise, and disgust expressions, and song contains calm, happy, sad, angry, and fearful emotions. Each expression is produced at two levels of emotional intensity (normal, strong), with an additional neutral expression. All conditions are available in three modality formats: Audio-only (16-bit, 48 kHz .wav), Audio-Video (720p H.264, AAC 48 kHz, .mp4), and Video-only (no sound). Note that there are no song files for Actor_18.

Audio-only files

Audio-only files of all actors (01-24) are available as two separate zip files (~200 MB each):

  • Speech file (Audio_Speech_Actors_01-24.zip, 215 MB) contains 1440 files: 60 trials per actor x 24 actors = 1440. 
  • Song file (Audio_Song_Actors_01-24.zip, 198 MB) contains 1012 files: 44 trials per actor x 23 actors = 1012.
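The Description states that audio-only files are 16-bit, 48 kHz .wav. A minimal sketch for checking an extracted file against that stated format is shown below, using Python's standard `wave` module. The function names are hypothetical, and the sketch assumes the files are plain PCM WAV, which this page does not state explicitly.

```python
import wave


def wav_format(path):
    """Read basic format information from a PCM WAV file."""
    with wave.open(str(path), "rb") as wf:
        sample_rate = wf.getframerate()
        return {
            "sample_rate_hz": sample_rate,
            "bit_depth": wf.getsampwidth() * 8,  # bytes per sample -> bits
            "channels": wf.getnchannels(),
            "duration_s": wf.getnframes() / sample_rate,
        }


def is_ravdess_audio_format(path):
    """Return True if the file is 16-bit, 48 kHz, as stated in the Description."""
    info = wav_format(path)
    return info["sample_rate_hz"] == 48000 and info["bit_depth"] == 16
```

The channel count is reported but not checked, because this page does not say whether the audio is mono or stereo.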

Audio-Visual and Video-only files

Video files are provided as separate zip downloads for each actor (01-24, ~500 MB each), and are split into separate speech and song downloads:

  • Speech files (Video_Speech_Actor_01.zip to Video_Speech_Actor_24.zip) collectively contain 2880 files: 60 trials per actor x 2 modalities (AV, VO) x 24 actors = 2880.
  • Song files (Video_Song_Actor_01.zip to Video_Song_Actor_24.zip) collectively contain 2024 files: 44 trials per actor x 2 modalities (AV, VO) x 23 actors = 2024.

File Summary

In total, the RAVDESS collection includes 7356 files (2880+2024+1440+1012 files).
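The file counts above can be reproduced from the design factors. The sketch below is an illustration, not part of the official documentation: the breakdown of 60 and 44 trials per actor is inferred from the emotions, intensities, statements, and repetitions listed on this page, and the variable and function names are hypothetical.

```python
# Design factors as described on this page.
SPEECH_EMOTIONS = ["calm", "happy", "sad", "angry", "fearful", "surprised", "disgust"]
SONG_EMOTIONS = ["calm", "happy", "sad", "angry", "fearful"]
N_INTENSITIES = 2   # normal, strong
N_STATEMENTS = 2    # "Kids ..." and "Dogs ..."
N_REPETITIONS = 2   # 1st and 2nd repetition
N_MODALITIES = 3    # full-AV, video-only, audio-only


def trials_per_actor(emotions):
    """Count trials for one actor in one vocal channel and one modality."""
    emotional = len(emotions) * N_INTENSITIES * N_STATEMENTS * N_REPETITIONS
    # Neutral is recorded at normal intensity only.
    neutral = 1 * N_STATEMENTS * N_REPETITIONS
    return emotional + neutral


speech_trials = trials_per_actor(SPEECH_EMOTIONS)  # 7*2*2*2 + 4 = 60
song_trials = trials_per_actor(SONG_EMOTIONS)      # 5*2*2*2 + 4 = 44

SPEECH_ACTORS = 24
SONG_ACTORS = 23  # no song files for Actor_18

total_files = N_MODALITIES * (speech_trials * SPEECH_ACTORS + song_trials * SONG_ACTORS)
print(speech_trials, song_trials, total_files)  # 60 44 7356
```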

File naming convention

Each of the 7356 RAVDESS files has a unique filename. The filename consists of a 7-part numerical identifier (e.g., 02-01-06-01-02-01-12.mp4). These identifiers define the stimulus characteristics: 

Filename identifiers 

  • Modality (01 = full-AV, 02 = video-only, 03 = audio-only).
  • Vocal channel (01 = speech, 02 = song).
  • Emotion (01 = neutral, 02 = calm, 03 = happy, 04 = sad, 05 = angry, 06 = fearful, 07 = disgust, 08 = surprised).
  • Emotional intensity (01 = normal, 02 = strong). NOTE: There is no strong intensity for the 'neutral' emotion.
  • Statement (01 = "Kids are talking by the door", 02 = "Dogs are sitting by the door").
  • Repetition (01 = 1st repetition, 02 = 2nd repetition).
  • Actor (01 to 24. Odd-numbered actors are male, even-numbered actors are female).


Filename example: 02-01-06-01-02-01-12.mp4 

  1. Video-only (02)
  2. Speech (01)
  3. Fearful (06)
  4. Normal intensity (01)
  5. Statement "dogs" (02)
  6. 1st Repetition (01)
  7. 12th Actor (12)
  8. Female, as the actor ID number is even.
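The naming convention above can be decoded programmatically. The following is a minimal sketch in Python; the lookup tables are taken from the identifier list on this page, while the function name `parse_ravdess_filename` and the `RavdessLabel` class are hypothetical and not part of any official RAVDESS tooling.

```python
from dataclasses import dataclass
from pathlib import Path

# Code tables copied from the filename identifier list.
MODALITIES = {1: "full-AV", 2: "video-only", 3: "audio-only"}
VOCAL_CHANNELS = {1: "speech", 2: "song"}
EMOTIONS = {1: "neutral", 2: "calm", 3: "happy", 4: "sad",
            5: "angry", 6: "fearful", 7: "disgust", 8: "surprised"}
INTENSITIES = {1: "normal", 2: "strong"}
STATEMENTS = {1: "Kids are talking by the door",
              2: "Dogs are sitting by the door"}


@dataclass(frozen=True)
class RavdessLabel:
    modality: str
    vocal_channel: str
    emotion: str
    intensity: str
    statement: str
    repetition: int
    actor: int
    sex: str


def parse_ravdess_filename(filename):
    """Decode a 7-part RAVDESS filename such as 02-01-06-01-02-01-12.mp4."""
    parts = Path(filename).stem.split("-")
    if len(parts) != 7:
        raise ValueError(f"expected 7 identifiers, got {len(parts)}: {filename!r}")
    mod, chan, emo, inten, stmt, rep, actor = (int(p) for p in parts)
    try:
        return RavdessLabel(
            modality=MODALITIES[mod],
            vocal_channel=VOCAL_CHANNELS[chan],
            emotion=EMOTIONS[emo],
            intensity=INTENSITIES[inten],
            statement=STATEMENTS[stmt],
            repetition=rep,
            actor=actor,
            # Odd-numbered actors are male, even-numbered actors are female.
            sex="male" if actor % 2 == 1 else "female",
        )
    except KeyError as exc:
        raise ValueError(f"unknown identifier code {exc} in {filename!r}") from exc


label = parse_ravdess_filename("02-01-06-01-02-01-12.mp4")
print(label.modality, label.emotion, label.actor, label.sex)
```

Applied to the example filename, the sketch yields the same reading as the worked example: video-only, speech, fearful, normal intensity, the "Dogs" statement, 1st repetition, actor 12, female.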

License information

The RAVDESS is released under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, CC BY-NC-SA 4.0 

Commercial licenses for the RAVDESS can also be purchased.  For more information, please visit our license fee page, or contact us at ravdess@gmail.com.

Related Data sets

Notes

Funding Information

  • Natural Sciences and Engineering Research Council of Canada: 2012-341583
  • Hear the world research chair in music and emotional speech from Phonak

Files

Files (25.6 GB)

MD5 checksum and size of each file:

  • md5:5411230427d67a21e18aa4d466e6d1b9 (225.5 MB)
  • md5:bc696df654c87fed845eb13823edef8a (208.5 MB)
  • md5:0f7ba20f3a7278d5a662a7e30f44c942 (502.5 MB)
  • md5:8e426ef4134abe1dff4d2a8262abe8a9 (553.4 MB)
  • md5:2d686591f8760f8041d2dc75e8234828 (508.7 MB)
  • md5:0f1b99c5acff841020546eec1aa32376 (482.2 MB)
  • md5:9a51cef1d34fb3e9838c0b08b88f9234 (529.8 MB)
  • md5:a8d06eac834dd9835d208444ac17c744 (518.1 MB)
  • md5:3c28c0ef1a70a625ae3b7168773c19bc (503.6 MB)
  • md5:9a71f2c8e6d19678a4cc2f0ba79e532a (525.8 MB)
  • md5:b4b482f6a984b540def54a9ac5d0b1b2 (507.4 MB)
  • md5:8bcfdcad55a12ff1b1da24082ff57819 (540.4 MB)
  • md5:39d6d954d3d779c4c5ac3ca95f99b284 (482.1 MB)
  • md5:818b8ce6463c42f56fa7df5a23eb15aa (488.6 MB)
  • md5:0311dd65a9d9dabec72a4482b6f1ea55 (493.2 MB)
  • md5:6c4a681c132c905cc8dff38d85117b37 (477.1 MB)
  • md5:7532b9ea79552db8a28a33bdf6b79d80 (470.4 MB)
  • md5:9f68766c1dc14222069c92da83db4875 (524.5 MB)
  • md5:7e0980984399183ca3d38843ffc00897 (548.7 MB)
  • md5:ecb411e08834305e64e11a4fb4aed741 (508.7 MB)
  • md5:21650223a74feeafdc54bf3e647a77a6 (533.5 MB)
  • md5:c926bb0244a4d4bf534eaca02c627e36 (507.3 MB)
  • md5:bcde19cd5460662c207293ae94f415f9 (546.9 MB)
  • md5:97772a15b0ba86177afc94c648d026af (527.2 MB)
  • md5:bfb25878b311c70079ce941bef04cd38 (528.6 MB)
  • md5:3c8ececaf392b4a9b11b32271f4f6d01 (553.0 MB)
  • md5:a6f40d413b2e6ef25b3a595099e59abb (570.7 MB)
  • md5:68fa240ddd8a3cc410c64efe5fb2b0e4 (567.8 MB)
  • md5:013eed832af1f7e97082aae398a00dbe (546.6 MB)
  • md5:b00e17d61374ef6ba86fef35d50a20fb (563.0 MB)
  • md5:3c42877921cc08cfb5c841a0f2cb94a7 (568.5 MB)
  • md5:bdd29bfe082f80361ec6b589845c283d (565.7 MB)
  • md5:bed5f43d9c18e177e34d4bc1ed9b6d77 (562.6 MB)
  • md5:775da349c50bf915a1bbcb37379bd092 (523.5 MB)
  • md5:70199d1c6902f76e17df308d5c41fa01 (565.6 MB)
  • md5:824504aa41a8fd575e5459d5701dd378 (518.6 MB)
  • md5:2a1f0ddc0ca207beedee6ce2ea863ad7 (557.9 MB)
  • md5:c5f2fd4fd77941636947620368c612b5 (501.9 MB)
  • md5:466d31f4cce92a14b1f4ee884bbdf7d7 (552.5 MB)
  • md5:4b3b9bc8473e86f884630f2506684887 (525.2 MB)
  • md5:a63c54570ebbe6bd616d0dced5630f7e (562.4 MB)
  • md5:8b7ae3e5a85d2874ac8e1195e722e047 (551.7 MB)
  • md5:fef2131ac361da182c0cb85f3d8e8a6c (565.7 MB)
  • md5:abb97d97d8c4e88c866cee1cea9d06d6 (581.1 MB)
  • md5:5ddfb22770093bafebdefe8faa449f48 (561.3 MB)
  • md5:d810cf2ff6863ef91306fc7a16d74fbb (590.3 MB)
  • md5:fee55f72c9871401f23d4b167de3217d (560.9 MB)
  • md5:b31e8c8904e66102e7a951694db752d6 (545.2 MB)
  • md5:eef90e806c9179bc5b4b098d03647ae3 (595.8 MB)
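The MD5 checksums in this listing can be used to confirm that a download completed without corruption. The sketch below uses Python's standard `hashlib`; the function names are hypothetical. Because the listing does not pair each checksum with a file name, the sketch only checks whether a file's checksum appears among the published values. The two values shown are the first two entries of the listing.

```python
import hashlib

# First two checksums from the listing; extend with the rest as needed.
PUBLISHED_MD5 = {
    "5411230427d67a21e18aa4d466e6d1b9",
    "bc696df654c87fed845eb13823edef8a",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published=PUBLISHED_MD5):
    """Return True if the file's MD5 digest is one of the published values."""
    return md5_of_file(path) in published
```

For example, `matches_published("Audio_Speech_Actors_01-24.zip")` would be expected to return True for an intact download, assuming that file's checksum is included in the set. Reading in chunks keeps memory use small for the roughly 500 MB video archives.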

Additional details

Related works

Is cited by
10.1371/journal.pone.0196391 (DOI)
Is referenced by
10.5281/zenodo.3255102 (DOI)

References

  • Livingstone SR, Russo FA (2018) The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English. PLoS ONE 13(5): e0196391. https://doi.org/10.1371/journal.pone.0196391