ObjectGraphs: Using Objects and a Graph Convolutional Network for the Bottom-up Recognition and Explanation of Events in Video

Gkalelis, Nikolaos; Goulas, Andreas; Galanopoulos, Damianos; Mezaris, Vasileios

doi:10.5281/zenodo.4963588

Published June 16, 2021 | Version v1

Conference paper | Open access

1. CERTH-ITI

This paper proposes ObjectGraphs, a novel bottom-up video event recognition approach that exploits a rich frame representation and the relations between the objects within each frame. An object detector (OD) is first applied to the frames; graphs then model the object relations, and a graph convolutional network (GCN) performs reasoning on these graphs. The resulting object-based frame-level features are forwarded to a long short-term memory (LSTM) network for video event recognition. Moreover, the weighted in-degrees (WiDs) derived from the graph’s adjacency matrix at frame level are used to identify the objects that were considered most (or least) salient for event recognition, i.e., those that contributed the most (or least) to the final recognition decision, thus providing an explanation for it. Experimental results show that the proposed method achieves state-of-the-art performance on the publicly available FCVID and YLI-MED datasets. Source code for the ObjectGraphs method is publicly available at: https://github.com/bmezaris/ObjectGraphs
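
To make the explanation step concrete, the following minimal sketch shows one way weighted in-degrees could be read off a frame-level adjacency matrix and used to rank detected objects. It is an illustration under stated assumptions, not the authors' implementation (see the repository linked above for that): it assumes `adjacency[i][j]` holds the weight of the edge from object i to object j, so that the WiD of object j is the sum of column j. The function names, object labels and matrix values are hypothetical.

```python
from typing import List, Sequence, Tuple


def weighted_in_degrees(adjacency: Sequence[Sequence[float]]) -> List[float]:
    """Return the weighted in-degree of every node in the graph.

    Assumption: adjacency[i][j] is the weight of the edge i -> j,
    so the in-degree of node j is the sum of column j.
    """
    n = len(adjacency)
    if any(len(row) != n for row in adjacency):
        raise ValueError("adjacency matrix must be square")
    return [sum(adjacency[i][j] for i in range(n)) for j in range(n)]


def rank_objects(
    adjacency: Sequence[Sequence[float]], labels: Sequence[str]
) -> List[Tuple[str, float]]:
    """Pair each object label with its WiD, sorted from highest to lowest."""
    wids = weighted_in_degrees(adjacency)
    if len(labels) != len(wids):
        raise ValueError("need exactly one label per graph node")
    return sorted(zip(labels, wids), key=lambda pair: pair[1], reverse=True)


# Toy frame with three detected objects (values invented for illustration).
toy_adjacency = [
    [0.0, 0.5, 0.25],
    [0.25, 0.0, 0.5],
    [0.5, 0.75, 0.0],
]
toy_labels = ["person", "ball", "tree"]

ranking = rank_objects(toy_adjacency, toy_labels)
for label, wid in ranking:
    print(f"{label}: {wid:.2f}")
```

In this toy frame the object at the top of the ranking would be reported as the most salient one and the object at the bottom as the least salient. If the adjacency matrix follows the opposite convention (row i holding the edges that point into object i), the row sums should be used instead.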

Files

Name	Size	MD5
camready_with_header.pdf	4.2 MB	0852b6e050198424a8dad21389401c60

Additional details

Funding
European Commission: MIRROR – Migration-Related Risks caused by misconceptions of Opportunities and Requirement (grant 832921)
European Commission: AI4Media – A European Excellence Centre for Media, Society and Democracy (grant 951911)
