Conference paper Open Access
Ilias Gialampoukidis; Anastasia Moumtzidou; Stefanos Vrochidis; Ioannis Kompatsiaris
Multimedia collections are ubiquitous and often contain hundreds of hours of video. Retrieving a particular scene of a video (Known Item Search) in a large collection is a difficult problem, given the multimodal character of the video shots and the complexity of the query, which may be visual or textual. We tackle these challenges by first fusing multiple modalities in a nonlinear, graph-based way for each subtopic of the query. We then fuse the top retrieved video shots per sub-query to obtain the final list of retrieved shots, which is re-ranked using temporal information. The framework is evaluated on popular Known Item Search tasks in the context of video shot retrieval and achieves the highest Mean Reciprocal Rank scores.
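The record does not spell out the nonlinear graph-based fusion itself, so the sketch below only illustrates the general pipeline shape described in the abstract: combining per-sub-query shot rankings in a simple late-fusion step (reciprocal-rank scoring, used here as a stand-in for the paper's fusion) and computing the Mean Reciprocal Rank metric used for evaluation. All function names and shot identifiers are hypothetical.

```python
from collections import defaultdict

def fuse_subquery_results(ranked_lists, k=60):
    """Merge per-sub-query ranked shot lists into a single ranking.

    Illustrative reciprocal-rank late fusion only; the paper's own fusion is
    nonlinear and graph-based and is not reproduced here.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:                     # one ranking per sub-query
        for rank, shot_id in enumerate(ranking, start=1):
            scores[shot_id] += 1.0 / (k + rank)      # higher score for earlier ranks
    return sorted(scores, key=scores.get, reverse=True)

def mean_reciprocal_rank(results, relevant):
    """MRR over queries: mean of 1/rank of the first relevant shot (0 if absent)."""
    rr = []
    for query_id, ranking in results.items():
        target = relevant[query_id]
        rr.append(next((1.0 / (i + 1) for i, s in enumerate(ranking) if s == target), 0.0))
    return sum(rr) / len(rr)

# Toy usage with made-up shot identifiers.
fused = fuse_subquery_results([["shot3", "shot1"], ["shot1", "shot7"]])
print(fused)                                          # ['shot1', 'shot3', 'shot7']
print(mean_reciprocal_rank({"q1": fused}, {"q1": "shot3"}))   # 0.5
```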
Name | Size
---|---
2018-ieee-image_.pdf (md5:1ccd838dd16ecf6bae4f72372495cba1) | 673.7 kB
Views | 171 |
Downloads | 115 |
Data volume | 77.5 MB |
Unique views | 153 |
Unique downloads | 114 |