MuSe-Trust: Multimodal Trustworthiness Sub-challenge (MuSe2020)
Description
MuSe-Trust of MuSe2020: predicting the level of trustworthiness of user-generated audio-visual content in a sequential manner, utilising a diverse range of features and, optionally, emotional (arousal and valence) predictions. This package includes only the MuSe-Trust features (all partitions) and the annotations of the training and development sets (test scoring via the MuSe website).
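As an illustration of how the released package might be consumed, the sketch below aligns frame-level features with the continuous trustworthiness annotations for one partition. The directory layout, file names, and column names (e.g., "timestamp") are assumptions for illustration only, not the official package structure.

```python
# Minimal sketch (hypothetical file layout): align per-video feature CSVs with
# the continuous trustworthiness annotations of the training/development sets.
# All paths and column names are assumptions, not the official package layout.
import glob
import os
import pandas as pd

FEATURE_DIR = "muse_trust/features/some_feature_set"      # hypothetical
LABEL_DIR = "muse_trust/labels/trustworthiness/train"     # hypothetical

def load_video(video_id: str) -> pd.DataFrame:
    """Merge features and labels for one video on a shared timestamp column."""
    feats = pd.read_csv(os.path.join(FEATURE_DIR, f"{video_id}.csv"))
    labels = pd.read_csv(os.path.join(LABEL_DIR, f"{video_id}.csv"))
    # Inner join keeps only timestamps present in both files.
    return feats.merge(labels, on="timestamp", how="inner")

if __name__ == "__main__":
    for path in sorted(glob.glob(os.path.join(LABEL_DIR, "*.csv"))):
        vid = os.path.splitext(os.path.basename(path))[0]
        df = load_video(vid)
        print(vid, df.shape)
```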
General: The purpose of the Multimodal Sentiment Analysis in Real-life Media Challenge and Workshop (MuSe) is to bring together communities from different disciplines, mainly the audio-visual emotion recognition community (signal-based) and the sentiment analysis community (symbol-based).
We introduce the novel MuSe-CAR dataset, which covers the aforementioned desiderata. MuSe-CAR is a large (>36 h), multimodal dataset gathered in-the-wild with the intention of furthering the understanding of Multimodal Sentiment Analysis in-the-wild, e.g., the emotional engagement that takes place during product reviews (here, automobile reviews) where a sentiment is linked to a topic or entity.
We have designed MuSe-CAR to be of high voice and video quality, as both informative social-media video content and everyday recording devices have improved in recent years. This enables robust learning even with a high degree of novel, in-the-wild characteristics, for example:
- Video: shot size (a mix of close-up, medium, and long shots), face angle (side, eye-level, low, high), camera motion (free, free but stable, free but unstable, switching, e.g., zoom, and fixed), reviewer visibility (full body, half body, face only, hands only), highly varying backgrounds, and people interacting with objects (car parts).
- Audio: ambient noises (car noises, music), narrator and host diarisation, diverse microphone types, and speaker locations.
- Text: colloquialisms and domain-specific terms.
Related works
- References: Preprint, DOI: 10.1145/3423327.3423673