Conference paper Open Access
Buccoli, Michele; Di Giorgi, Bruno; Zanoni, Massimiliano; Antonacci, Fabio; Sarti, Augusto
Motion analysis and tracking often rely on multimodal signals, e.g., video, depth maps, and motion capture (MoCap), because of the completeness of the information they jointly provide. The joint analysis of multimodal signals requires knowing their correct relative timing, i.e., the signals must be aligned. In this paper we propose an approach to automatically estimate the correct matching and alignment between a video and a MoCap recording acquired in the same session, based on the multi-dimensional correlation of velocity-based features extracted from the two recordings. We validate our approach on a dataset of dance recordings spanning four genres, and we achieve promising results in both the alignment and the matching scenarios.
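The general idea of aligning and matching two recordings by correlating velocity features can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the features or the correlation measure, so the sketch assumes that both recordings have already been reduced to feature sequences with the same frame rate and the same number of dimensions, uses frame-to-frame differences as velocity, and scores each candidate lag by the mean per-dimension Pearson correlation. The names `velocity`, `best_lag`, `best_match`, and the parameters `max_lag` and `min_overlap` are hypothetical.

```python
from math import sqrt

def velocity(positions):
    """Frame-to-frame difference of a sequence of frames (lists of floats)."""
    return [[q - p for p, q in zip(prev, cur)]
            for prev, cur in zip(positions, positions[1:])]

def _pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0

def best_lag(feat_a, feat_b, max_lag, min_overlap=8):
    """Return (lag, score) maximising the mean per-dimension correlation.

    A lag of k means that feat_a[t] is compared with feat_b[t + k].
    Assumption: both sequences share frame rate and dimensionality.
    """
    dims = len(feat_a[0])
    best = (0, float("-inf"))
    for lag in range(-max_lag, max_lag + 1):
        start_a, start_b = max(0, -lag), max(0, lag)
        n = min(len(feat_a) - start_a, len(feat_b) - start_b)
        if n < min_overlap:
            continue  # too few overlapping frames for a meaningful score
        score = sum(
            _pearson([feat_a[start_a + t][d] for t in range(n)],
                     [feat_b[start_b + t][d] for t in range(n)])
            for d in range(dims)) / dims
        if score > best[1]:
            best = (lag, score)
    return best

def best_match(query, candidates, max_lag):
    """Index of the candidate whose peak correlation with the query is highest."""
    scores = [best_lag(query, c, max_lag)[1] for c in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)
```

In this sketch, alignment corresponds to the lag returned by `best_lag`, and matching corresponds to choosing the candidate recording with the highest peak score via `best_match`. The exhaustive lag search is quadratic in the search range and is meant only to make the logic explicit.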