Multimodal Sentiment Analysis in Car Reviews dataset - raw data (MuSe-CAR raw)
Description
General: The purpose of the Multimodal Sentiment Analysis in Real-life media Challenge (MuSe) is to bring together communities from different disciplines, mainly the audio-visual emotion recognition community (signal-based) and the sentiment analysis community (symbol-based).
We introduce the novel dataset MuSe-CAR that covers the range of aforementioned desiderata. MuSe-CAR is a large (>36h) multimodal dataset gathered in-the-wild to further the understanding of multimodal sentiment analysis in real-life settings, e.g., the emotional engagement that takes place during product reviews (here, automobile reviews), where a sentiment is linked to a topic or entity.
We have designed MuSe-CAR to be of high voice and video quality, as both informative social media video content and everyday recording devices have improved in recent years. This enables robust learning, even with a high degree of novel, in-the-wild characteristics, for example as related to: i) Video: shot size (a mix of closeup, medium, and long shots), face angle (side, eye, low, high), camera motion (free, free but stable, free but unstable, switch (e.g., zoom), and fixed), reviewer visibility (full body, half body, face only, and hands only), highly varying backgrounds, and people interacting with objects (car parts). ii) Audio: ambient noises (car noises, music), narrator and host diarisation, diverse microphone types, and speaker locations. iii) Text: colloquialisms and domain-specific terms.
Additional details
Related works
- References
- Preprint: 10.1145/3423327.3423673 (DOI)