Spatiotemporal Action Recognition in Videos Using ConvLSTM with Attention: A Comparative Analysis and Implementation

Kohari, Mohammed Kaif

doi:10.5281/zenodo.16738167

Published August 4, 2025 | Version v1

Dissertation Open

This dissertation explores the application of Convolutional Long Short-Term Memory
(ConvLSTM) networks for video action recognition. ConvLSTM integrates the spatial
feature extraction capabilities of CNNs with the temporal modeling strengths of LSTM
networks, enabling the model to capture both spatial and temporal dependencies within
video data. The primary goal of this research is to develop an efficient, scalable model
capable of recognizing actions in real time. The UCF-101 dataset is used to evaluate the
effectiveness of the proposed model, with performance compared against traditional CNN
and LSTM approaches. Additionally, various preprocessing techniques and hyperparameter
configurations are examined to understand their impact on model performance.
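
To make the ConvLSTM idea in the description concrete, the following is a minimal NumPy sketch of one recurrent step, in which the matrix multiplications of a standard LSTM are replaced by 2D convolutions so that the hidden state keeps its spatial layout. It is an illustration under stated assumptions, not code taken from the dissertation or its repository: the function names (`conv2d_same`, `convlstm_step`, `encode_clip`), the tensor shapes, and the random weights are all hypothetical, the gates omit peephole connections, and the attention mechanism named in the title is not shown.

```python
import numpy as np


def conv2d_same(x, w):
    """Zero-padded cross-correlation with stride 1.

    x: (C_in, H, W); w: (C_out, C_in, k, k) with odd k. Returns (C_out, H, W).
    """
    k = w.shape[-1]
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    # windows has shape (C_in, H, W, k, k): one k x k patch per output pixel.
    windows = np.lib.stride_tricks.sliding_window_view(xp, (k, k), axis=(1, 2))
    return np.einsum("chwij,ocij->ohw", windows, w)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def convlstm_step(x, h, c, w, b):
    """One ConvLSTM step (assumed variant without peephole terms).

    x: (C_in, H, W) current frame; h, c: (C_hid, H, W) hidden and cell state.
    w: (4 * C_hid, C_in + C_hid, k, k); b: (4 * C_hid,).
    """
    # A single convolution over [x, h] produces all four gate pre-activations.
    gates = conv2d_same(np.concatenate([x, h], axis=0), w) + b[:, None, None]
    i, f, o, g = np.split(gates, 4, axis=0)
    c_next = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # spatial memory update
    h_next = sigmoid(o) * np.tanh(c_next)              # spatial hidden state
    return h_next, c_next


def encode_clip(frames, w, b, hidden_channels):
    """Run the cell over a clip of shape (T, C_in, H, W); return the final h."""
    _, _, height, width = frames.shape
    h = np.zeros((hidden_channels, height, width))
    c = np.zeros_like(h)
    for x in frames:
        h, c = convlstm_step(x, h, c, w, b)
    return h


# Hypothetical toy input: 8 RGB frames of 16 x 16 pixels, 4 hidden channels.
rng = np.random.default_rng(0)
frames = rng.standard_normal((8, 3, 16, 16))
w = 0.1 * rng.standard_normal((4 * 4, 3 + 4, 3, 3))
b = np.zeros(4 * 4)
h_final = encode_clip(frames, w, b, hidden_channels=4)
print(h_final.shape)  # (4, 16, 16)
```

In a classifier, the final hidden state would typically be pooled and passed to a dense layer over the action classes; that part is left out here because the record does not describe it.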

Files (1.5 MB)

Name	Size
Research.pdf md5:5261de9e9c16c19bbef444c6bb67ab5f	1.5 MB

Additional details

Cites: Publication: arXiv:1705.07750 (arXiv)

Available: 2025-08

Repository URL: https://github.com/Kaif10/Action-Recognition-in-Videos
