Aircraft Marshaling Signals Dataset of FMCW Radar and Event-Based Camera for Sensor Fusion
Creators
Description
Dataset Introduction
The advent of neural networks capable of learning salient features from variations in radar data has expanded the breadth of radar applications, often as an alternative sensor or a complementary modality to camera vision. Gesture recognition for command and control is arguably the most commonly explored application. Nevertheless, better benchmarking datasets than those currently available are needed to assess and compare the merits of the different proposed solutions, and to explore a broader range of scenarios than simple hand gestures performed a few centimeters away from a radar transceiver. Most currently available radar datasets used in gesture recognition offer limited diversity, do not provide access to raw ADC data, and are not particularly challenging. To address these shortcomings, we created and make available a new dataset that combines FMCW radar and a dynamic vision (event-based) camera for 10 whole-body aircraft marshalling signals, recorded from 13 people at several distances and angles from the sensors. The two modalities are hardware-synchronized using the radar's PRI signal. Moreover, in the supporting publication we propose a sparse encoding of the time-domain (ADC) signals that achieves a dramatic data-rate reduction (>76%) while retaining the efficacy of the downstream FFT processing (<2% accuracy loss on recognition tasks), and that can be used to create a sparse event-based representation of the radar data. In this way the dataset can be used as a two-modality neuromorphic dataset.
Synchronization of the two modalities
The PRI pulses from the radar have been hard-wired into the event stream of the DVS sensor and timestamped using the DVS clock. Based on this signal, the DVS event stream has been segmented such that groups of DVS events (time-bins) are mapped to individual radar pulses (chirps).
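As an illustration, assuming the PRI pulse timestamps are available in the DVS clock domain and the events are sorted by timestamp, the binning amounts to a searchsorted over the event timestamps. This is a minimal sketch; the field names and time units are assumptions, not the actual parser API:

```python
import numpy as np

# Minimal sketch: bin DVS events per radar chirp using the PRI timestamps.
# Field names ("t", "x", "y") and microsecond units are assumptions.
events = np.zeros(1000, dtype=[("t", "<u8"), ("x", "<u2"), ("y", "<u2")])  # sorted by "t"
chirp_ts = np.arange(0, 1_300_000, 1300, dtype=np.uint64)  # one PRI pulse per ~1.3 ms chirp

edges = np.searchsorted(events["t"], chirp_ts)  # first event index of each time-bin
bins = [events[edges[i]:edges[i + 1]] for i in range(len(edges) - 1)]  # events per chirp
```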
Data storage
DVS events (x, y coordinates and timestamps) are stored in structured arrays, and one such structured array object is associated with the data of one radar transmission (pulse/chirp). A radar transmission is a vector of 512 ADC levels corresponding to the sampling points of one chirp of the FMCW radar, which lasts ~1.3 ms. Every 192 radar transmissions are stacked into a matrix called a radar frame (each transmission is a row of that matrix). A data capture (recording), consisting of some thousands of continuous radar transmissions, is therefore segmented into a number of radar frames. Finally, the radar frames and the corresponding DVS structured arrays are stored in separate containers in a custom multi-container file format (extension .rad). We provide a (rad file) parser for extracting the data from these files. There is one file per capture of continuous gesture recording of about 10 s.
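For concreteness, a minimal sketch of the layout just described (the shapes follow from the text; the dtype field names and ADC sample type are assumptions):

```python
import numpy as np

# One radar frame: 192 chirps (rows) x 512 ADC samples (columns).
radar_frame = np.zeros((192, 512), dtype=np.int16)  # ADC sample type is an assumption

# One structured DVS-event array per chirp: timestamp plus (x, y) coordinates.
dvs_dtype = np.dtype([("t", "<u8"), ("x", "<u2"), ("y", "<u2")])
dvs_per_chirp = [np.empty(0, dtype=dvs_dtype) for _ in range(radar_frame.shape[0])]
```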
Note that the number of 192 transmissions per radar frame is an ad-hoc segmentation that suits the purpose of obtaining sufficient signal resolution in the 2D FFT typical of radar signal processing, given the range resolution of this specific radar. It also served the purpose of fast streamed storage of the data during capture. For extracting individual data points from the dataset, however, one can pool together (concatenate) all the radar frames from a single capture file and re-segment them as desired. The data loader that we provide offers this, with a default of re-segmenting every 769 transmissions (about 1 s of gesturing).
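A hypothetical re-segmentation along these lines, including the typical 2D (Doppler-range) FFT mentioned above, could look as follows. The shapes come from the text; everything else is illustrative:

```python
import numpy as np

# Pool all radar frames of one capture and cut windows of 769 chirps (~1 s),
# mirroring the loader's default re-segmentation.
frames = [np.zeros((192, 512), dtype=np.int16) for _ in range(40)]  # placeholder ~10 s capture
chirps = np.concatenate(frames, axis=0)                 # (total_chirps, 512)

win = 769
n_win = chirps.shape[0] // win
windows = chirps[: n_win * win].reshape(n_win, win, 512)  # (n_windows, 769, 512)

# Typical radar processing: FFT over samples (range) and chirps (Doppler).
rd_maps = np.fft.fft2(windows, axes=(-1, -2))
```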
Data captures directory organization (radar8Ghz-DVS-marshaling_signals_20220901_publication_anonymized.7z)
The dataset captures (recordings) are organized in a common directory structure that encodes additional metadata about the captures.
dataset_dir/<stage>/<room>/<person>-<gesture>-<distance>/ofxRadar8Ghz_yyyy-mm-dd_HH-MM-SS.rad
Identifiers
- stage: [train, test].
- room: [conference_room, foyer, open_space].
- person (subject): [0-9]. Note that 0 stands for no person, and 1 for an unlabeled, random person (only present in the test set).
- gesture: ['none', 'emergency_stop', 'move_ahead', 'move_back_v1', 'move_back_v2', 'slow_down', 'start_engines', 'stop_engines', 'straight_ahead', 'turn_left', 'turn_right'].
- distance: ['xxx', '100', '150', '200', '250', '300', '350', '400', '450'] (in cm). Note that 'xxx' is used for 'none' gestures when there is no person in front of the radar (i.e. background samples), or when a person is walking in front of the radar at varying distances but performing no gesture.
The test data captures contain both subjects that appear in the train data and previously unseen subjects. Similarly, the test data contain captures from the spaces in which the train data were recorded, as well as from a new, unseen open space.
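For example, the identifiers can be recovered from a capture path with a simple pattern match over the template above (a sketch; the example path is made up):

```python
import re

# Regex mirroring the path template:
# <stage>/<room>/<person>-<gesture>-<distance>/ofxRadar8Ghz_yyyy-mm-dd_HH-MM-SS.rad
PATTERN = re.compile(
    r"(?P<stage>train|test)/(?P<room>[^/]+)/"
    r"(?P<person>\d+)-(?P<gesture>\w+)-(?P<distance>\w+)/"
    r"ofxRadar8Ghz_.*\.rad$"
)

path = "dataset_dir/train/foyer/3-move_ahead-200/ofxRadar8Ghz_2022-09-01_12-00-00.rad"
meta = PATTERN.search(path).groupdict()
# {'stage': 'train', 'room': 'foyer', 'person': '3', 'gesture': 'move_ahead', 'distance': '200'}
```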
Files List
radar8Ghz-DVS-marshaling_signals_20220901_publication_anonymized.7z
This is the actual archive bundle with the data captures (recordings).
Parser for individual .rad files, which contain capture data.
loader.py
A convenience PyTorch Dataset loader (partly Tonic-compatible). This is practically all you need for a quick start if you don't want to delve too deeply into the code. When you instantiate a DvsRadarAircraftMarshallingSignals object, it automatically downloads the dataset archive and the .rad file parser, unpacks the archive, and imports the parser to load the data. You can then request from it a training set, a validation set, and a test set as torch.utils.data.Dataset objects to work with; see the sketch after this list.
aircraft_marshalling_signals_howto.ipynb
Jupyter notebook with exemplary basic use of loader.py.
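A quick-start along the lines described above might look as follows. The class name is taken from this page; the accessor names are hypothetical, so check loader.py or the how-to notebook for the actual API:

```python
from loader import DvsRadarAircraftMarshallingSignals

# Instantiating triggers download of the archive and the .rad parser,
# then unpacks and indexes the captures.
dataset = DvsRadarAircraftMarshallingSignals()

# Hypothetical accessors returning torch.utils.data.Dataset objects.
train_set = dataset.get_train()
val_set = dataset.get_validation()
test_set = dataset.get_test()
```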
Contact
For further information or questions, please first contact M. Sifalakis or F. Corradi.
Additional details
Dates
- Created: 2023-05-01
- Updated: 2023-11-30 (bugfix in file parser and PyTorch dataset loader)