Published September 1, 2025 | Version v1
Journal article
Open
Enhancing Car Safety with Multimodal Emotion Recognition using CNN-LSTM Networks
Authors/Creators
- 1. Computer Engineering Department, MKSSS's Cummins College of Engineering for Women, India
Description
Aggressive driving behaviors caused by emotional impairments such as anger, stress, and fatigue contribute significantly to traffic accidents worldwide. Existing single-modal emotion recognition systems fail to capture the full complexity of human emotional states, particularly when different modalities convey conflicting signals, limiting their effectiveness in real-world driving scenarios.
This study aims to enhance automotive safety by developing a robust real-time multimodal emotion recognition system that integrates visual and auditory cues to accurately detect driver emotional states and trigger appropriate safety interventions.
We developed a hybrid CNN-LSTM model that processes facial expressions through Convolutional Neural Networks (CNNs) for spatial feature extraction and speech patterns through Long Short-Term Memory (LSTM) networks for temporal sequence analysis. The system employs decision-level fusion to integrate multimodal data from the RAVDESS dataset (7,356 files, 24 actors, balanced gender distribution, 8 emotions based on Ekman's model: anger, calm, neutral, surprise, disgust, sadness, fear, happiness). A 2-second time window with 60 frames per sequence was used for temporal modeling, and evaluation was conducted using a 70-30 train-test split and 5-fold cross-validation.
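Decision-level fusion combines the two branches only after each has produced its own prediction. The following is a minimal sketch of that idea, not the authors' implementation. It assumes each branch emits a probability vector over the eight listed emotions and that the fusion rule is a weighted average; the abstract does not state the rule, and the names `fuse_decisions` and `face_weight` are hypothetical.

```python
# Emotion labels as listed in the abstract (the ordering here is an assumption).
EMOTIONS = ["anger", "calm", "neutral", "surprise",
            "disgust", "sadness", "fear", "happiness"]


def fuse_decisions(face_probs, speech_probs, face_weight=0.5):
    """Fuse two per-class probability vectors by weighted averaging.

    The weighted-average rule and the default weight are illustrative
    assumptions; the source only says "decision-level fusion".
    """
    if len(face_probs) != len(EMOTIONS) or len(speech_probs) != len(EMOTIONS):
        raise ValueError("expected one probability per emotion")
    if not 0.0 <= face_weight <= 1.0:
        raise ValueError("face_weight must lie in [0, 1]")
    speech_weight = 1.0 - face_weight
    fused = [face_weight * f + speech_weight * s
             for f, s in zip(face_probs, speech_probs)]
    # Pick the index of the highest fused score.
    best = max(range(len(EMOTIONS)), key=fused.__getitem__)
    return EMOTIONS[best], fused


# Conflicting modalities: the face looks neutral, the voice sounds angry.
face = [0.10, 0.05, 0.60, 0.05, 0.05, 0.05, 0.05, 0.05]
speech = [0.80, 0.02, 0.08, 0.02, 0.02, 0.02, 0.02, 0.02]
label, scores = fuse_decisions(face, speech)
print(label)  # anger
```

With equal weights, the confident speech prediction outweighs the weaker facial one, which is the kind of conflict a single-modal system cannot resolve.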
The proposed model achieved 98.28% accuracy and 98.77% precision, with real-time processing at ~22.5 FPS on NVIDIA Jetson Xavier NX embedded hardware. It significantly outperformed traditional machine learning approaches (SVM: 37.33%) and was competitive with Transformer-based models. Performance remained robust under test conditions that included 10% facial occlusion and 20 dB background noise.
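For readers unfamiliar with how accuracy and precision differ on a multi-class task, the sketch below computes both from predicted and true labels. It is illustrative only: the abstract does not say how precision was averaged across the eight classes, so macro-averaging is an assumption, and the toy labels are invented for the example rather than taken from the study.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_precision(y_true, y_pred):
    """Unweighted mean of per-class precision (macro-averaging is assumed)."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        # True labels of every sample that was predicted as `label`.
        predicted_as_label = [t for t, p in zip(y_true, y_pred) if p == label]
        if not predicted_as_label:
            per_class.append(0.0)  # class never predicted: score it as 0
            continue
        hits = sum(t == label for t in predicted_as_label)
        per_class.append(hits / len(predicted_as_label))
    return sum(per_class) / len(per_class)


# Toy labels, invented for illustration.
y_true = ["anger", "anger", "calm", "fear"]
y_pred = ["anger", "calm", "calm", "fear"]
print(accuracy(y_true, y_pred))         # 0.75
print(macro_precision(y_true, y_pred))  # about 0.833
```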
The hybrid CNN-LSTM framework successfully addresses the limitations of single-modal systems by providing accurate, real-time emotion recognition suitable for integration with Advanced Driver Assistance Systems (ADAS). The system can trigger safety measures including speed limiters, contributing to enhanced road safety through proactive emotional state monitoring.
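One way such a trigger could be wired to the recognizer's output is sketched below. Everything in it is hypothetical: the abstract does not describe the triggering logic, so the set of risky emotions, the confidence threshold, and the requirement of several consecutive windows (to avoid reacting to a single misclassification) are illustrative assumptions, as is the class name `InterventionTrigger`.

```python
from collections import deque

# Hypothetical choice of emotions that should count toward an intervention.
RISKY_EMOTIONS = {"anger", "fear"}


class InterventionTrigger:
    """Fire only after several consecutive high-confidence risky windows."""

    def __init__(self, min_confidence=0.8, required_windows=3):
        self.min_confidence = min_confidence
        # Keeps only the most recent `required_windows` observations.
        self.recent = deque(maxlen=required_windows)

    def update(self, emotion, confidence):
        """Record one analysis window; return True if a safety measure
        (for example, a speed limiter) should be engaged."""
        risky = emotion in RISKY_EMOTIONS and confidence >= self.min_confidence
        self.recent.append(risky)
        return len(self.recent) == self.recent.maxlen and all(self.recent)


trigger = InterventionTrigger()
for emotion, confidence in [("anger", 0.9), ("anger", 0.85), ("anger", 0.95)]:
    engaged = trigger.update(emotion, confidence)
print(engaged)  # True
```

Requiring a run of consecutive windows trades a short delay for fewer false activations; a production ADAS integration would need its own validated policy.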
Files
p1545-1563.pdf (2.3 MB)
md5:332ee684a49a4295fdcdda1b55c50259
Additional details
Related works
- Is identical to
- Journal article: 10.5109/7388848 (DOI)
- Is supplemented by
- Other: https://citation.crossref.org/?doi=10.5109/7388848 (URL)