Published December 20, 2021 | Version v1
Conference paper Open

Using Vision Transformers and Memorable Moments for the Prediction of Video Memorability

  • 1. University Politehnica of Bucharest, Romania

Description

This paper describes the approach taken by the AI Multimedia Lab team for the MediaEval 2021 Predicting Media Memorability task. Our approach is based on a Vision Transformer learning method, which is optimized by filtering the training sets of the two proposed datasets. We attempt to train the proposed methods on video segments that are more representative of the videos they belong to. We test several types of filtering architectures and submit those that performed best in our preliminary studies.

Files

ME2021MemorabilityMethod.pdf (403.0 kB)
md5:2a24fd9e9a13b9aa7bfbc4f3853a96d8

Additional details

Funding

AI4Media – A European Excellence Centre for Media, Society and Democracy (grant 951911)
European Commission