Lightweight Multi-Modal AI Framework for Face-Swap Deepfake Detection

Sunayana P. Singh; Tanvi Muttin; T. Mokshita Seshu; Vagarth Pandey; Deepak N A

doi:10.5281/zenodo.18252663

Published January 15, 2026 | Version v1

Journal article Open

Face-swap deepfakes have risen in fidelity and accessibility, posing growing threats to personal privacy, identity integrity, and public trust in digital media. Modern generative models produce manipulated content that escapes casual human observation and can deceive conventional automated detectors. This growing realism demands robust, transparent, and computationally efficient detection systems. We propose a lightweight, multi-modal AI framework that fuses spatial (CNN), temporal (LSTM/GRU), frequency (DCT/FFT), and audio modalities through an attention-based fusion mechanism to identify face-swap deepfake videos. The framework is designed not only for detection accuracy but also for real-world deployability: it uses key-frame extraction and compact neural backbones to operate effectively on constrained hardware. Explainability is prioritized through visualization tools such as Grad-CAM and integrated gradients, together with modality-level confidence reporting, to enhance forensic interpretability. Our work bridges the gap between high-performance academic models and practical field applications by focusing on modular design, reproducible experimentation, and cross-dataset generalization. The resulting system aims to support real-time media verification pipelines, assist investigators in forensic reporting, and promote public resilience against synthetic media threats. Overall, the framework lays the foundation for transparent, efficient, and responsible deepfake detection in the evolving landscape of generative AI.
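The abstract names an attention-based fusion of four modalities with modality-level confidence reporting, but this record does not specify the mechanism. The following is a minimal sketch of one common choice, scaled dot-product attention over per-modality embeddings, and should not be read as the authors' implementation. The function `attention_fuse`, the `query` vector (standing in for a learned parameter), and all numeric values are hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, List, Tuple

# Modality names taken from the abstract; everything else is an assumption.
MODALITIES = ("spatial", "temporal", "frequency", "audio")


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(
    embeddings: Dict[str, List[float]], query: List[float]
) -> Tuple[List[float], Dict[str, float]]:
    """Fuse per-modality embeddings with scaled dot-product attention.

    Returns the fused embedding and the attention weight of each modality.
    The weights sum to 1 and can be reported as a rough modality-level
    confidence (an interpretation assumed here, not stated in the source).
    """
    # Skip modalities that are absent, e.g. a video without an audio track.
    names = [name for name in MODALITIES if name in embeddings]
    dim = len(query)
    scores = [
        sum(q * x for q, x in zip(query, embeddings[name])) / math.sqrt(dim)
        for name in names
    ]
    weights = softmax(scores)
    fused = [
        sum(w * embeddings[name][i] for w, name in zip(weights, names))
        for i in range(dim)
    ]
    return fused, dict(zip(names, weights))


# Hypothetical 3-dimensional embeddings, one per modality branch.
embeddings = {
    "spatial": [0.9, 0.1, 0.0],
    "temporal": [0.4, 0.5, 0.1],
    "frequency": [0.2, 0.2, 0.6],
    "audio": [0.1, 0.7, 0.2],
}
query = [1.0, 0.0, 0.0]  # stand-in for a learned query vector

fused, weights = attention_fuse(embeddings, query)
for name, weight in weights.items():
    print(f"{name}: {weight:.3f}")
```

In a trained system the embeddings would come from the CNN, LSTM/GRU, DCT/FFT, and audio branches, and the query would be learned jointly with them. Because the sketch normalizes only over the modalities that are present, a clip with no audio track still yields weights that sum to 1.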

Files

Lightweight Multi-Modal AI Framework.pdf (841.6 kB, md5:d51bcef5398cc1042f676ad25c240008)


