Spatiotemporal Action Recognition in Videos Using ConvLSTMwith Attention: A Comparative Analysis and Implementation
Creators
Description
This dissertation explores the application of Convolutional Long Short-Term Memory
(ConvLSTM) networks for video action recognition. ConvLSTM integrates the spatial
feature extraction capabilities of CNNs with the temporal modeling strengths of LSTM
networks, enabling the model to capture both spatial and temporal dependencies within
video data. The primary goal of this research is to develop an efficient, scalable model
capable of recognizing actions in real-time. The UCF-101 dataset is used to evaluate the
effectiveness of the proposed model, with performance compared against traditional CNN
and LSTM approaches. Additionally, various preprocessing techniques and hyperparameter
configurations are examined to understand their impact on model performance.
Files
Research.pdf
Files
(1.5 MB)
Name | Size | Download all |
---|---|---|
md5:5261de9e9c16c19bbef444c6bb67ab5f
|
1.5 MB | Preview Download |
Additional details
Related works
- Cites
- Publication: arXiv:1705.07750 (arXiv)
Dates
- Available
-
2025-08
Software
- Repository URL
- https://github.com/Kaif10/Action-Recognition-in-Videos