Conference paper Open Access
Troncy, Raphaël; Laaksonen, Jorma; Tavakoli, Hamed; Nixon, Lyndon; Mezaris, Vasileios; Hosseini, Mohammad
Technologies for comprehensive video understanding have undergone major advances in recent years. Such understanding involves detecting and identifying the visual elements of a scene, combining them with audio understanding (music, speech), and aligning them with textual information (captions, subtitles) and background knowledge. The workshop brings together experts from academia and industry to discuss the latest progress in artificial intelligence research on multimodal information analysis, in particular the semantic analysis of video, audio, and textual information for smart digital TV content production, access, and delivery.