Kordopatis-Zilos, Giorgos
Papadopoulos, Symeon
Patras, Ioannis
Kompatsiaris, Yiannis
2016-12-31
<p>The problem of Near-Duplicate Video Retrieval (NDVR) has attracted increasing interest due to the huge growth of video content on the Web, which is characterized by a high degree of near-duplication. This calls for efficient NDVR approaches. Motivated by the outstanding performance of <em>Convolutional Neural Networks</em> (CNNs) on a wide variety of computer vision problems, we leverage intermediate CNN features in a novel global video representation by means of a layer-based feature aggregation scheme. We perform extensive experiments on the widely used CC_WEB_VIDEO dataset, evaluating three popular deep architectures (AlexNet, VGGNet, GoogLeNet) and demonstrating that the proposed approach outperforms the state of the art, achieving a mean Average Precision (mAP) score of 0.976.</p>
https://doi.org/10.1007/978-3-319-51811-4_21
oai:zenodo.org:240645
Zenodo
https://zenodo.org/communities/invid-h2020
https://zenodo.org/communities/eu
info:eu-repo/semantics/openAccess
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
MMM2017, 23rd International Conference on Multimedia Modeling, Reykjavik, Iceland, January 4-6, 2017
Near-duplicate
Video retrieval
CNNs
Bag of keyframes
Near-Duplicate Video Retrieval by Aggregating Intermediate CNN Layers
info:eu-repo/semantics/conferencePaper