Published October 18, 2024 | Version 1.0.1
Dataset Open

Wallhack1.8k Dataset | Data Augmentation Techniques for Cross-Domain WiFi CSI-Based Human Activity Recognition

  • 1. Computer Vision Lab, TU Wien

Description

This repository contains the Wallhack1.8k dataset for WiFi-based long-range activity recognition in Line-of-Sight (LoS) and Non-Line-of-Sight (NLoS)/Through-Wall scenarios, as proposed in [1,2], as well as the CAD models (of 3D-printable parts) of the WiFi systems proposed in [2].

PyTorch Dataloader

A minimal PyTorch dataloader for the Wallhack1.8k dataset is provided at: https://github.com/StrohmayerJ/wallhack1.8k

Dataset Description

The Wallhack1.8k dataset comprises 1,806 CSI amplitude spectrograms (and raw WiFi packet time series) corresponding to three activity classes: "no presence," "walking," and "walking + arm-waving." WiFi packets were transmitted at a frequency of 100 Hz, and each spectrogram captures a temporal context of approximately 4 seconds (400 WiFi packets).
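The window length follows from the packet rate: 100 Hz × 4 s = 400 packets per spectrogram. The sketch below illustrates that arithmetic by cutting a packet-by-subcarrier amplitude matrix into 400-packet windows. It assumes non-overlapping windows and a hypothetical helper name (`split_into_windows`); the description above does not state the stride actually used to produce the published spectrograms.

```python
import numpy as np

PACKET_RATE_HZ = 100      # packet rate stated in the dataset description
WINDOW_SECONDS = 4        # approximate temporal context per spectrogram
WINDOW_PACKETS = PACKET_RATE_HZ * WINDOW_SECONDS  # 400 packets

def split_into_windows(amplitudes, window=WINDOW_PACKETS):
    """Cut a (packets, subcarriers) matrix into non-overlapping windows.

    Non-overlapping windows are an assumption made for this sketch;
    trailing packets that do not fill a whole window are dropped.
    """
    a = np.asarray(amplitudes)
    n_windows = a.shape[0] // window
    return a[: n_windows * window].reshape(n_windows, window, a.shape[1])

# Placeholder data: 1000 packets, 52 subcarriers
demo = np.zeros((1000, 52))
windows = split_into_windows(demo)
```

With 1000 placeholder packets this produces two full windows; the remaining 200 packets are discarded.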

To assess cross-scenario and cross-system generalization, WiFi packet sequences were collected in LoS and through-wall (NLoS) scenarios, utilizing two different WiFi systems (BQ: biquad antenna and PIFA: printed inverted-F antenna). The dataset is structured accordingly:

  • LOS/BQ/ <- WiFi packets collected in the LoS scenario using the BQ system
  • LOS/PIFA/ <- WiFi packets collected in the LoS scenario using the PIFA system
  • NLOS/BQ/ <- WiFi packets collected in the NLoS scenario using the BQ system
  • NLOS/PIFA/ <- WiFi packets collected in the NLoS scenario using the PIFA system

These directories contain the raw WiFi packet time series (see Table 1). Each row represents a single WiFi packet, with the complex CSI vector H stored in the "data" field and the class label stored in the "class" field. H has the form [I, R, I, R, ..., I, R], where each pair of consecutive entries holds the imaginary and real parts of one complex number (the Channel Frequency Response of a subcarrier). After combining each pair into a complex value, taking the absolute value (e.g., via numpy.abs) yields the subcarrier amplitudes A.
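A minimal sketch of the amplitude computation, assuming the "data" field of one packet has already been parsed into a flat numeric sequence. The parsing step and the helper name `csi_amplitudes` are illustrative and not part of the dataset tooling. Because imaginary and real parts are interleaved, each pair is first combined into one complex number before the absolute value is taken.

```python
import numpy as np

def csi_amplitudes(h_interleaved):
    """Convert an interleaved [I, R, I, R, ...] CSI vector into amplitudes."""
    h = np.asarray(h_interleaved, dtype=np.float64)
    if h.size % 2 != 0:
        raise ValueError("expected an even number of entries (I/R pairs)")
    imag = h[0::2]  # even positions hold the imaginary parts
    real = h[1::2]  # odd positions hold the real parts
    return np.abs(real + 1j * imag)

# Toy vector with two subcarriers: (I=3, R=4) and (I=0, R=-2)
amplitudes = csi_amplitudes([3, 4, 0, -2])
```

For the toy vector, the two amplitudes are 5 and 2.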

To extract the 52 L-LTF subcarriers used in [1], select the following indices of A:

# 52 L-LTF subcarriers 
csi_valid_subcarrier_index = []
csi_valid_subcarrier_index += [i for i in range(6, 32)]
csi_valid_subcarrier_index += [i for i in range(33, 59)]

An additional 56 HT-LTF subcarriers can be selected via:

# 56 HT-LTF subcarriers
csi_valid_subcarrier_index += [i for i in range(66, 94)]     
csi_valid_subcarrier_index += [i for i in range(95, 123)]

For more details on subcarrier selection, see ESP-IDF (Section Wi-Fi Channel State Information) and esp-csi.
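As an illustration of how these index lists might be applied, the sketch below selects columns from a hypothetical amplitude matrix with one row per packet. The matrix is random placeholder data, and its width of 128 columns is an assumption chosen only to be large enough for the indices above; the variable names are likewise illustrative.

```python
import numpy as np

# Index lists as given in the dataset description
l_ltf_index = list(range(6, 32)) + list(range(33, 59))      # 52 L-LTF subcarriers
ht_ltf_index = list(range(66, 94)) + list(range(95, 123))   # 56 HT-LTF subcarriers

# Hypothetical amplitude matrix: 400 packets x 128 subcarrier slots
A = np.random.default_rng(0).random((400, 128))

A_l_ltf = A[:, l_ltf_index]                # L-LTF subcarriers only
A_all = A[:, l_ltf_index + ht_ltf_index]   # L-LTF followed by HT-LTF subcarriers
```

Selecting along the second axis keeps the packet (time) dimension intact, so each result can be treated as a time-by-subcarrier amplitude matrix.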

The extracted amplitude spectrograms can be found in the spectrograms/ directory, together with the label files of the train/validation/test split: "trainLabels.csv," "validationLabels.csv," and "testLabels.csv."

The columns in the label files correspond to the following: [Spectrogram index, Class label, Room label]

  • Spectrogram index: [0, ..., n]
  • Class label: [0,1,2], where 0 = "no presence", 1 = "walking", and 2 = "walking + arm-waving."
  • Room label: [0,1,2,3,4,5], where labels 1-5 correspond to the room number in the NLoS scenario (see Fig. 3 in [1]). The label 0 corresponds to no room and is used for the "no presence" class.
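A minimal sketch of reading such a label file with Python's standard csv module. It assumes comma-separated integer columns in the order listed above and no header row; neither detail is confirmed by this description, so adjust as needed. The helper name `read_labels` and the sample rows are made up for illustration.

```python
import csv
import io

CLASS_NAMES = {0: "no presence", 1: "walking", 2: "walking + arm-waving"}

def read_labels(file_obj):
    """Parse rows of [spectrogram index, class label, room label]."""
    rows = []
    for record in csv.reader(file_obj):
        if not record:
            continue  # skip empty lines
        index, class_label, room_label = (int(v) for v in record[:3])
        rows.append((index, CLASS_NAMES[class_label], room_label))
    return rows

# Made-up sample standing in for a real label file
sample = io.StringIO("0,0,0\n1,1,3\n2,2,5\n")
labels = read_labels(sample)
```

For a real file, the same function could be called on an opened file handle, e.g. `open("spectrograms/trainLabels.csv", newline="")`, assuming that path matches the extracted archive.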

Dataset Overview:

Table 1: Raw WiFi packet sequences.

Scenario | System | "no presence" / label 0 | "walking" / label 1 | "walking + arm-waving" / label 2 | Total
LoS | BQ | b1.csv | w1.csv, w2.csv, w3.csv, w4.csv and w5.csv | ww1.csv, ww2.csv, ww3.csv, ww4.csv and ww5.csv |
LoS | PIFA | b1.csv | w1.csv, w2.csv, w3.csv, w4.csv and w5.csv | ww1.csv, ww2.csv, ww3.csv, ww4.csv and ww5.csv |
NLoS | BQ | b1.csv | w1.csv, w2.csv, w3.csv, w4.csv and w5.csv | ww1.csv, ww2.csv, ww3.csv, ww4.csv and ww5.csv |
NLoS | PIFA | b1.csv | w1.csv, w2.csv, w3.csv, w4.csv and w5.csv | ww1.csv, ww2.csv, ww3.csv, ww4.csv and ww5.csv |
Total | | 4 | 20 | 20 | 44
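The file layout in Table 1 can be enumerated programmatically. The sketch below builds the 44 relative paths from the directory names and file names listed above; it does not touch the file system, and the variable names are illustrative.

```python
from pathlib import PurePosixPath

scenarios = ["LOS", "NLOS"]
systems = ["BQ", "PIFA"]

# One "no presence" file, five "walking" files, five "walking + arm-waving" files
file_names = (
    ["b1.csv"]
    + [f"w{i}.csv" for i in range(1, 6)]
    + [f"ww{i}.csv" for i in range(1, 6)]
)

raw_files = [
    PurePosixPath(scenario, system, name)
    for scenario in scenarios
    for system in systems
    for name in file_names
]
```

Each scenario/system directory contributes 11 files, giving 4 × 11 = 44 paths in total, which matches the table.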

Table 2: Sample/Spectrogram distribution across activity classes in Wallhack1.8k.

Scenario | System | "no presence" / label 0 | "walking" / label 1 | "walking + arm-waving" / label 2 | Total
LoS | BQ | 149 | 154 | 155 |
LoS | PIFA | 149 | 160 | 152 |
NLoS | BQ | 148 | 150 | 152 |
NLoS | PIFA | 143 | 147 | 147 |
Total | | 589 | 611 | 606 | 1,806
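As a quick consistency check, the per-class totals and the overall sample count in Table 2 can be reproduced from the per-row counts. The dictionary below simply restates the table; the variable names are illustrative.

```python
# Spectrogram counts per (scenario, system): (label 0, label 1, label 2)
counts = {
    ("LoS", "BQ"): (149, 154, 155),
    ("LoS", "PIFA"): (149, 160, 152),
    ("NLoS", "BQ"): (148, 150, 152),
    ("NLoS", "PIFA"): (143, 147, 147),
}

# Sum each class column across the four scenario/system combinations
class_totals = tuple(sum(column) for column in zip(*counts.values()))
grand_total = sum(class_totals)
```

The column sums come out to 589, 611, and 606, for 1,806 spectrograms overall.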

 

Download and Use
This data may be used for non-commercial research purposes only. If you publish material based on this data, we request that you include a reference to one of our papers [1,2].

[1] Strohmayer, Julian, and Martin Kampel. "Data Augmentation Techniques for Cross-Domain WiFi CSI-Based Human Activity Recognition." IFIP International Conference on Artificial Intelligence Applications and Innovations. Cham: Springer Nature Switzerland, 2024.

[2] Strohmayer, Julian, and Martin Kampel. "Directional Antenna Systems for Long-Range Through-Wall Human Activity Recognition." arXiv preprint arXiv:2401.01388 (2024).

BibTeX citations:

@inproceedings{strohmayer2024data,
  title={Data Augmentation Techniques for Cross-Domain WiFi CSI-Based Human Activity Recognition},
  author={Strohmayer, Julian and Kampel, Martin},
  booktitle={IFIP International Conference on Artificial Intelligence Applications and Innovations},
  pages={42--56},
  year={2024},
  organization={Springer}}

@inproceedings{strohmayer10647666,
  author={Strohmayer, Julian and Kampel, Martin},
  booktitle={2024 IEEE International Conference on Image Processing (ICIP)},
  title={Directional Antenna Systems for Long-Range Through-Wall Human Activity Recognition},
  year={2024},
  volume={},
  number={},
  pages={3594-3599},
  doi={10.1109/ICIP51287.2024.10647666}}

Files

CAD.zip

Files (136.4 MB)

Name | Size
md5:f706a5dd729daf5d8b498f713f8a538d | 1.2 MB
md5:57a0b9baa3532e1172b25d756536071a | 135.1 MB

Additional details

Related works

Is published in
Conference paper: 10.1007/978-3-031-63211-2_4 (DOI)
Conference paper: 10.1109/ICIP51287.2024.10647666 (DOI)

Dates

Available
2024-06-27

Software

Repository URL
https://github.com/StrohmayerJ/wallhack1.8k
Programming language
Python