Journal article Open Access

Visual Interestingness Prediction: A Benchmark Framework and Literature Review

Mihai Gabriel Constantin; Liviu-Daniel Ştefan; Bogdan Ionescu; Ngoc Q. K. Duong; Claire-Héléne Demarty; Mats Sjöberg

In this paper, we report on the creation of a publicly available, common evaluation framework for image and video visual interestingness prediction. We propose a robust data set, the Interestingness10k, comprising 9831 images and more than 4 h of video, interestingness scores derived from more than 1M pair-wise annotations by 800 trusted annotators, a set of pre-computed multi-modal descriptors, and 192 system output results that serve as baselines. The data were validated extensively during the 2016–2017 MediaEval benchmark campaigns. We provide an in-depth analysis of the crucial components of visual interestingness prediction algorithms by reviewing the capabilities and the evolution of the MediaEval benchmark systems, as well as of prominent systems from the literature. We discuss overall trends, the influence of the employed features and techniques, generalization capabilities, and the reliability of results. We also discuss the possibility of going beyond state-of-the-art performance via an automatic, ad-hoc system fusion, and propose a deep MLP-based architecture that outperforms the current state-of-the-art systems by a large margin. Finally, we provide the most important lessons learned and insights gained.
