Published December 20, 2021 | Version v1
Conference paper
Open
Using Vision Transformers and Memorable Moments for the Prediction of Video Memorability
Description
This paper describes the approach taken by the AI Multimedia Lab team for the MediaEval 2021 Predicting Media Memorability task. Our approach is based on a Vision Transformer-based learning method, which is optimized by filtering the training sets of the two proposed datasets. We attempt to train the methods we propose with video segments that are more representative of the videos they are part of. We evaluate several types of filtering architectures, then submit and test those that performed best in our preliminary studies.
Files
Name | Size |
---|---|
ME2021MemorabilityMethod.pdf (md5:2a24fd9e9a13b9aa7bfbc4f3853a96d8) | 403.0 kB |