Published June 8, 2026 | Version v1

3D-Net Dataset

  • 1. Universität Innsbruck

Description

The 3D-Net dataset is a paired RGB and long-wave infrared (LWIR) benchmark for UAV-based vehicle detection. Data were collected using a Workswell WIRIS Enterprise multi-sensor camera mounted on a twinFOLD KAT hexacopter at approximately 30 m altitude over a road in Ampass, Austria, across three recording sessions covering eight scenes at varying oblique viewing angles.

The dataset comprises 12,263 pixel-aligned RGB-thermal image pairs at 852×672 pixels with 9,060 bounding boxes across six classes: car, truck, bus, motorcycle, bicycle, and person. As data originate from continuous video recordings, the splits are assigned chronologically per scene: the first 70% of frames form the training set, the next 15% the validation set, and the final 15% the test set. Spatial alignment between modalities is computed via an affine transformation correcting for field-of-view differences and boresight error.
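The chronological 70/15/15 split described above can be sketched as follows. This is only an illustrative sketch, not the authors' released tooling: the function name `chronological_split`, the use of integer percentages, and the assumption that each scene's frames are already sorted by capture time are all hypothetical choices made for the example.

```python
def chronological_split(frames, train_pct=70, val_pct=15):
    """Split one scene's time-ordered frames into train/val/test.

    `frames` is assumed to be sorted by capture time already.
    Integer percentages avoid floating-point rounding surprises;
    any remainder after train and val goes to the test set.
    """
    n = len(frames)
    n_train = n * train_pct // 100
    n_val = n * val_pct // 100
    train = frames[:n_train]
    val = frames[n_train:n_train + n_val]
    test = frames[n_train + n_val:]
    return train, val, test


def split_by_scene(scenes):
    """Apply the chronological split to each scene independently.

    `scenes` maps a (hypothetical) scene name to its ordered frame list.
    """
    splits = {"train": [], "val": [], "test": []}
    for name in sorted(scenes):
        train, val, test = chronological_split(scenes[name])
        splits["train"].extend(train)
        splits["val"].extend(val)
        splits["test"].extend(test)
    return splits
```

Splitting within each scene rather than over the pooled frames keeps every scene represented in all three sets, while the chronological order keeps adjacent video frames from being scattered across sets.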

Files

Files (4.0 GB)

Name Size
3d_net_rgbt.zip 4.0 GB
md5:e82fc9c8c52c1aafdbf69466031eac71
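
After downloading, the archive can be checked against the MD5 checksum listed above. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `verify_archive` and the local file path are assumptions for illustration.

```python
import hashlib

# Checksum as listed for 3d_net_rgbt.zip on this record.
EXPECTED_MD5 = "e82fc9c8c52c1aafdbf69466031eac71"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks
    so that a multi-gigabyte archive is never loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected checksum."""
    return md5_of_file(path) == expected
```

For example, `verify_archive("3d_net_rgbt.zip")` would return `True` for an intact download, assuming the archive was saved under that name in the current directory.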