Published August 8, 2024 | Version v1
Journal Open

DiRecNetV2: A Transformer-Enhanced Network for Aerial Disaster Recognition

  • 1. KIOS Research and Innovation Center of Excellence, University of Cyprus

Description

Abstract:

The integration of Unmanned Aerial Vehicles (UAVs) for disaster assessment via aerial imagery necessitates models that demonstrate exceptional accuracy, computational efficiency, and real-time processing capabilities. Traditional Convolutional Neural Networks (CNNs) are efficient at local feature extraction but limited in their capacity for global context interpretation. Vision Transformers (ViTs), on the other hand, show promise for improved global context interpretation through attention mechanisms, although they remain underinvestigated in UAV-based disaster response applications. Bridging this research gap, we introduce DiRecNetV2, a novel hybrid model that merges the strengths of CNNs and ViTs. It combines the robust feature extraction of CNNs with the global context understanding of ViTs while maintaining a low computational load suited to UAV applications. Additionally, we introduce a new, compact multi-label dataset of disasters to set an initial benchmark for future research, exploring how models trained on single-label data perform on a multi-label test set.
The study assesses lightweight CNNs and ViTs on the AIDERSv2 dataset, using frames per second (FPS) to measure efficiency and the weighted F1 score to measure classification performance. DiRecNetV2 not only achieves a weighted F1 score of 0.964 on a single-label test set but also demonstrates adaptability, with a score of 0.614 on a complex multi-label test set, while running at 176.13 FPS on the NVIDIA Jetson Orin device. The results point to a promising direction for future research: a hybrid approach that combines the global context capabilities of ViTs with the robust feature representation maps of CNNs has the potential to achieve high accuracy and could emerge as a mainstream architectural choice in the field.
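For readers unfamiliar with the metric, the weighted F1 score reported in the abstract is conventionally the mean of per-class F1 scores, each weighted by the number of true samples of that class. The following is a minimal sketch of that standard definition for the single-label case. It is not the authors' evaluation code, and the class names and labels are hypothetical, chosen only for illustration.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 for single-label predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for cls, n_cls in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = n_cls - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n_cls / total) * f1  # weight each class by its support
    return score


# Hypothetical labels, for illustration only (not data from the paper).
y_true = ["fire", "fire", "flood", "normal", "normal", "normal"]
y_pred = ["fire", "flood", "flood", "normal", "normal", "fire"]
print(round(weighted_f1(y_true, y_pred), 3))  # prints 0.678
```

Weighting by support means that frequent classes dominate the score, which is worth keeping in mind when comparing results on test sets with different class balance.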

 

Cite:

Shianios, D., Kolios, P.S. & Kyrkou, C. DiRecNetV2: A Transformer-Enhanced Network for Aerial Disaster Recognition. SN COMPUT. SCI. 5, 770 (2024).

 

AIDERv2 dataset:

https://zenodo.org/records/10891054

Notes

This version of the manuscript has been accepted for publication in the journal SN Computer Science after peer review (Author Accepted Manuscript). It is not the final published version (Version of Record) and does not reflect any post-acceptance improvements. The Version of Record is available online at https://doi.org/10.1007/s42979-024-03066-y

Files

manuscript.pdf

Files (101.4 MB)

Name: manuscript.pdf
Size: 101.4 MB
md5:2a9bc77eff51fe9802d13d4fdee3f704
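The MD5 checksum listed for the file can be used to confirm that a download completed without corruption. The sketch below uses Python's standard `hashlib` module. The local file name `manuscript.pdf` and its location in the current directory are assumptions about where the file was saved.

```python
import hashlib
from pathlib import Path

# Checksum as listed on this record.
EXPECTED_MD5 = "2a9bc77eff51fe9802d13d4fdee3f704"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Assumption: the file was saved as manuscript.pdf in the current directory.
pdf = Path("manuscript.pdf")
if pdf.exists():
    print("checksum OK" if md5_of_file(pdf) == EXPECTED_MD5 else "checksum MISMATCH")
```

MD5 is adequate here for detecting accidental corruption. It should not be relied on as protection against deliberate tampering.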

Additional details

Related works

Compiles
Image: 10.5281/zenodo.10891054 (DOI)

Funding

European Commission
KIOS CoE - KIOS Research and Innovation Centre of Excellence 739551

Dates

Submitted
2024-03-03
Accepted
2024-06-03