Using Vision Transformers and Memorable Moments for the Prediction of Video Memorability

doi:10.5281/zenodo.6366900

AI4Media H2020 Project

Published December 20, 2021 | Version v1

Conference paper | Open access

1. University Politehnica of Bucharest, Romania

This paper describes the approach taken by the AI Multimedia Lab team for the MediaEval 2021 Predicting Media Memorability task. Our approach is based on a Vision Transformer learning method, which we optimize by filtering the training sets of the two datasets proposed for the task. We attempt to train the proposed methods on the video segments that are most representative of the videos they belong to. We test several types of filtering architectures and submit those that performed best in our preliminary studies.
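
The abstract does not state how representative segments are selected, so the following is only a minimal sketch of what training-set filtering could look like, not the authors' method. It assumes each video has already been split into segments and that each segment carries a hypothetical representativeness `score`; the `Segment` class, the `filter_segments` function, and the `top_k` parameter are all illustrative names that do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    video_id: str
    start_s: float  # segment start time, in seconds
    end_s: float    # segment end time, in seconds
    score: float    # hypothetical representativeness score (assumption)


def filter_segments(segments, top_k=1):
    """Keep the top_k highest-scoring segments of each video."""
    by_video = {}
    for seg in segments:
        by_video.setdefault(seg.video_id, []).append(seg)

    kept = []
    for segs in by_video.values():
        ranked = sorted(segs, key=lambda s: s.score, reverse=True)
        kept.extend(ranked[:top_k])
    return kept


# Toy data: two videos with made-up scores.
segments = [
    Segment("v1", 0.0, 2.0, 0.4),
    Segment("v1", 2.0, 4.0, 0.9),
    Segment("v1", 4.0, 6.0, 0.7),
    Segment("v2", 0.0, 2.0, 0.2),
    Segment("v2", 2.0, 4.0, 0.1),
]

# Only the best segment of each video is kept for training.
training_set = filter_segments(segments, top_k=1)
```

In this sketch, a downstream model would be trained on `training_set` instead of on every segment. Raising `top_k` trades stricter filtering for more training data.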

Files

Name	Size
ME2021MemorabilityMethod.pdf (md5:2a24fd9e9a13b9aa7bfbc4f3853a96d8)	403.0 kB

Additional details

Funding: European Commission, grant 951911, AI4Media – A European Excellence Centre for Media, Society and Democracy

	All versions	This version
Views	80	80
Downloads	61	61
Data volume	25.8 MB	25.8 MB

DOI: 10.5281/zenodo.6366900
Resource type: Conference paper
Publisher: Zenodo

Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: March 18, 2022
Modified: March 19, 2022
