Fusing Events and Frames with Self-Attention Network for Ball Collision Detection
Description
Ensuring robust and real-time obstacle avoidance is critical for the safe operation of autonomous robots in dynamic, real-world environments.
This paper proposes a neural network framework for predicting the position and time of a collision between an unmanned aerial vehicle and a dynamic object, using only RGB and event-based vision sensors.
The proposed architecture consists of two separate encoder branches, one per modality, whose features are fused with a self-attention module to improve prediction accuracy.
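For illustration only, a minimal sketch of such a two-branch, self-attention fusion model is shown below; the description does not specify the actual backbones, dimensions, or fusion details, so every module name, channel count, and token size in the snippet is an assumption rather than the paper's implementation.

```python
# Minimal sketch (assumed names/dimensions): two modality encoders whose token
# sequences are concatenated and fused with multi-head self-attention, then
# regressed to a collision position (x, y, z) and a time-to-collision.
import torch
import torch.nn as nn

class FusionCollisionPredictor(nn.Module):
    def __init__(self, d_model=128, n_heads=4, n_tokens=16):
        super().__init__()
        # Placeholder per-modality encoders; the paper's actual backbones may differ.
        self.rgb_encoder = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(32 * 16, n_tokens * d_model),
        )
        self.event_encoder = nn.Sequential(
            nn.Conv2d(2, 32, 5, stride=2), nn.ReLU(),  # 2 polarity channels assumed
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(32 * 16, n_tokens * d_model),
        )
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.head = nn.Linear(d_model, 4)  # (x, y, z) collision position + time
        self.n_tokens, self.d_model = n_tokens, d_model

    def forward(self, rgb, events):
        b = rgb.shape[0]
        rgb_tok = self.rgb_encoder(rgb).view(b, self.n_tokens, self.d_model)
        evt_tok = self.event_encoder(events).view(b, self.n_tokens, self.d_model)
        tokens = torch.cat([rgb_tok, evt_tok], dim=1)  # concatenate both modalities
        fused, _ = self.attn(tokens, tokens, tokens)   # self-attention fusion
        return self.head(fused.mean(dim=1))            # pooled joint prediction

# Example: one RGB frame and one 2-channel event representation at 128x128.
model = FusionCollisionPredictor()
out = model(torch.randn(1, 3, 128, 128), torch.randn(1, 2, 128, 128))
print(out.shape)  # torch.Size([1, 4])
```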
To facilitate benchmarking, we introduce a multi-modal dataset that enables detailed comparisons of single-modality and fusion-based approaches.
At the same prediction throughput of 50 Hz, the experimental results show that the fusion-based model improves prediction accuracy over the single-modality approaches by 1% on average and by 10% for distances beyond 0.5 m, but at the cost of +71% memory and +105% FLOPs. Notably, the event-based model outperforms the RGB model by 4% in position error and 26% in time error at a similar computational cost, making it a competitive alternative.
Additionally, we evaluate quantized versions of the event-based models, applying 1- to 8-bit quantization to assess the trade-offs between predictive performance and computational efficiency.
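As a purely illustrative sketch (the description does not detail the quantization scheme used in the paper), uniform symmetric fake-quantization of weights at a chosen bit width could look as follows; the function name and the special handling of the 1-bit case are assumptions.

```python
# Minimal sketch of uniform symmetric weight (fake) quantization at an
# arbitrary bit width; the paper's actual quantization method and deployment
# target may differ.
import torch

def quantize_symmetric(w: torch.Tensor, bits: int) -> torch.Tensor:
    """Quantize a weight tensor to `bits` bits and dequantize back to float."""
    if bits == 1:
        # One common binary choice: sign of the weights scaled by their mean magnitude.
        return w.sign() * w.abs().mean()
    qmax = 2 ** (bits - 1) - 1                      # e.g. 127 for 8-bit
    scale = w.abs().max() / qmax
    return torch.clamp(torch.round(w / scale), -qmax, qmax) * scale

# Example: simulate 4-bit weights for every parameter of a model.
# for p in model.parameters():
#     p.data = quantize_symmetric(p.data, bits=4)
```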
These findings highlight the potential of multi-modal perception using RGB and event-based cameras in robotic applications.
Files
Name | Size
---|---
models.zip (md5:b3ed4d48b23fb5ae9a3c46c733a15ac5) | 132.0 MB
Additional details
Additional titles
- Alternative title: Towards Low-Latency Event-based Obstacle Avoidance on a FPGA-Drone

Dates
- Accepted: 2025-06