Wallhack1.8k Dataset | Data Augmentation Techniques for Cross-Domain WiFi CSI-Based Human Activity Recognition
Description
This repository contains the Wallhack1.8k dataset for WiFi-based long-range activity recognition in Line-of-Sight (LoS) and Non-Line-of-Sight (NLoS)/Through-Wall scenarios, as proposed in [1,2], as well as the CAD models (of 3D-printable parts) of the WiFi systems proposed in [2].
PyTorch Dataloader
A minimal PyTorch dataloader for the Wallhack1.8k dataset is provided at: https://github.com/StrohmayerJ/wallhack1.8k
Dataset Description
The Wallhack1.8k dataset comprises 1,806 CSI amplitude spectrograms (and raw WiFi packet time series) corresponding to three activity classes: "no presence," "walking," and "walking + arm-waving." WiFi packets were transmitted at a frequency of 100 Hz, and each spectrogram captures a temporal context of approximately 4 seconds (400 WiFi packets).
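The relation between packet rate, window length, and temporal context can be sketched as follows. This is a minimal illustration, not the procedure used to build the dataset: the function name `split_into_windows`, the `stride` parameter (non-overlapping windows by default), and the synthetic input array are assumptions introduced here.

```python
import numpy as np

PACKET_RATE_HZ = 100   # packet transmission rate stated in the description
WINDOW_PACKETS = 400   # packets per spectrogram stated in the description


def split_into_windows(amplitudes, window=WINDOW_PACKETS, stride=WINDOW_PACKETS):
    """Cut a (packets, subcarriers) array into (windows, window, subcarriers).

    `stride` is an assumption; with stride == window the windows do not overlap.
    """
    amplitudes = np.asarray(amplitudes)
    starts = list(range(0, amplitudes.shape[0] - window + 1, stride))
    if not starts:
        # Recording shorter than one window: return an empty stack.
        return np.empty((0, window) + amplitudes.shape[1:])
    return np.stack([amplitudes[s:s + window] for s in starts])


# Synthetic stand-in for one recording: 1,000 packets x 52 subcarriers.
demo = np.random.default_rng(0).random((1000, 52))
windows = split_into_windows(demo)

# 400 packets at 100 Hz correspond to 4 seconds of temporal context.
window_seconds = WINDOW_PACKETS / PACKET_RATE_HZ
```

With the synthetic 1,000-packet recording, two full windows fit and the remaining 200 packets are dropped.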
To assess cross-scenario and cross-system generalization, WiFi packet sequences were collected in LoS and through-wall (NLoS) scenarios, utilizing two different WiFi systems (BQ: biquad antenna and PIFA: printed inverted-F antenna). The dataset is structured accordingly:
- LOS/BQ/ <- WiFi packets collected in the LoS scenario using the BQ system
- LOS/PIFA/ <- WiFi packets collected in the LoS scenario using the PIFA system
- NLOS/BQ/ <- WiFi packets collected in the NLoS scenario using the BQ system
- NLOS/PIFA/ <- WiFi packets collected in the NLoS scenario using the PIFA system
These directories contain the raw WiFi packet time series (see Table 1). Each row represents a single WiFi packet, with the complex CSI vector H stored in the "data" field and the class label stored in the "class" field. H has the form [I, R, I, R, ..., I, R], where each pair of consecutive entries holds the imaginary and real parts of one complex number (the Channel Frequency Response of a subcarrier). Combining each pair into a complex number and taking its absolute value (e.g., via numpy.abs) yields the subcarrier amplitudes A.
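The conversion from the interleaved vector H to amplitudes A might look like the sketch below. The helper name `csi_amplitudes` and the toy input are assumptions for illustration. Parsing the "data" field from the CSV files is not shown, since its exact storage format is not described here.

```python
import numpy as np


def csi_amplitudes(h):
    """Convert an interleaved [I, R, I, R, ...] vector into amplitudes A."""
    h = np.asarray(h, dtype=np.float64)
    if h.size % 2 != 0:
        raise ValueError("expected an even number of entries (I/R pairs)")
    imag = h[0::2]  # even positions hold the imaginary parts
    real = h[1::2]  # odd positions hold the real parts
    return np.abs(real + 1j * imag)


# Toy vector with three subcarriers: (I=3, R=4), (I=0, R=-5), (I=-8, R=6).
h_demo = [3, 4, 0, -5, -8, 6]
a_demo = csi_amplitudes(h_demo)  # amplitudes 5, 5, 10
```

Note that the result has half as many entries as H, so subcarrier indices refer to positions in A, not in H.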
To extract the 52 L-LTF subcarriers used in [1], select the following indices of A:
# 52 L-LTF subcarriers
csi_valid_subcarrier_index = []
csi_valid_subcarrier_index += [i for i in range(6, 32)]
csi_valid_subcarrier_index += [i for i in range(33, 59)]
An additional 56 HT-LTF subcarriers can be selected via:
# 56 HT-LTF subcarriers
csi_valid_subcarrier_index += [i for i in range(66, 94)]
csi_valid_subcarrier_index += [i for i in range(95, 123)]
For more details on subcarrier selection, see ESP-IDF (Section Wi-Fi Channel State Information) and esp-csi.
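Applied to a block of packets, the index lists above can be used for NumPy fancy indexing. The sketch below assumes 128 amplitude entries per packet and uses random data in place of real recordings; both are assumptions made for illustration, as are the variable names.

```python
import numpy as np

# Index lists as given above, split by training field.
l_ltf_index = list(range(6, 32)) + list(range(33, 59))      # 52 L-LTF subcarriers
ht_ltf_index = list(range(66, 94)) + list(range(95, 123))   # 56 HT-LTF subcarriers

# Synthetic stand-in: 400 packets with 128 amplitude entries each (assumed length).
packets = np.random.default_rng(0).random((400, 128))

l_ltf = packets[:, l_ltf_index]                   # shape (400, 52)
combined = packets[:, l_ltf_index + ht_ltf_index]  # shape (400, 108)
```

Indices 32 and 94 are skipped by the ranges, so they do not appear in either selection.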
The extracted amplitude spectrograms can be found in the spectrograms/ directory, together with the label files of the train/validation/test split: "trainLabels.csv," "validationLabels.csv," and "testLabels.csv."
The columns in the label files correspond to the following: [Spectrogram index, Class label, Room label]
- Spectrogram index: [0, ..., n]
- Class label: [0,1,2], where 0 = "no presence", 1 = "walking", and 2 = "walking + arm-waving."
- Room label: [0,1,2,3,4,5], where labels 1-5 correspond to the room number in the NLoS scenario (see Fig. 3 in [1]). The label 0 corresponds to no room and is used for the "no presence" class.
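A label file with these three columns could be read as sketched below. The comma delimiter, the handling of a possible header row, and the helper name `read_labels` are assumptions; the sample rows are made up and do not come from the dataset.

```python
import csv
import io

CLASS_NAMES = {0: "no presence", 1: "walking", 2: "walking + arm-waving"}


def read_labels(handle):
    """Parse rows of [spectrogram index, class label, room label]."""
    rows = []
    for row in csv.reader(handle):
        # Skip blank lines and a possible header row (assumption).
        if not row or not row[0].strip().isdigit():
            continue
        index, class_label, room_label = (int(value) for value in row[:3])
        rows.append(
            {"index": index, "class": CLASS_NAMES[class_label], "room": room_label}
        )
    return rows


# Made-up sample in the documented column order.
sample = io.StringIO("0,0,0\n1,1,3\n2,2,5\n")
labels = read_labels(sample)
```

For a real file, the same function can be called with an open file handle, e.g. `open("spectrograms/trainLabels.csv", newline="")`.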
Dataset Overview:
Table 1: Raw WiFi packet sequences.
Scenario | System | "no presence" / label 0 | "walking" / label 1 | "walking + arm-waving" / label 2 | Total |
LoS | BQ | b1.csv | w1.csv, w2.csv, w3.csv, w4.csv and w5.csv | ww1.csv, ww2.csv, ww3.csv, ww4.csv and ww5.csv | 11 |
LoS | PIFA | b1.csv | w1.csv, w2.csv, w3.csv, w4.csv and w5.csv | ww1.csv, ww2.csv, ww3.csv, ww4.csv and ww5.csv | 11 |
NLoS | BQ | b1.csv | w1.csv, w2.csv, w3.csv, w4.csv and w5.csv | ww1.csv, ww2.csv, ww3.csv, ww4.csv and ww5.csv | 11 |
NLoS | PIFA | b1.csv | w1.csv, w2.csv, w3.csv, w4.csv and w5.csv | ww1.csv, ww2.csv, ww3.csv, ww4.csv and ww5.csv | 11 |
Total | | 4 | 20 | 20 | 44 |
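The directory layout and the file names in Table 1 can be combined to enumerate the expected raw files with their class labels. This is a sketch only: the root directory name "Wallhack1.8k" and the helper `expected_files` are hypothetical, and the code builds paths without touching the file system.

```python
from pathlib import PurePosixPath

SCENARIOS = ("LOS", "NLOS")
SYSTEMS = ("BQ", "PIFA")
FILES_BY_CLASS = {
    0: ["b1.csv"],                              # "no presence"
    1: [f"w{i}.csv" for i in range(1, 6)],      # "walking"
    2: [f"ww{i}.csv" for i in range(1, 6)],     # "walking + arm-waving"
}


def expected_files(root="."):
    """Return (path, class label) pairs for all raw packet files in Table 1."""
    return [
        (PurePosixPath(root) / scenario / system / name, label)
        for scenario in SCENARIOS
        for system in SYSTEMS
        for label, names in FILES_BY_CLASS.items()
        for name in names
    ]


# "Wallhack1.8k" is a hypothetical extraction directory.
files = expected_files("Wallhack1.8k")
```

The resulting list has 44 entries, matching the total in Table 1.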
Table 2: Sample/Spectrogram distribution across activity classes in Wallhack1.8k.
Scenario | System | "no presence" / label 0 | "walking" / label 1 | "walking + arm-waving" / label 2 | Total |
LoS | BQ | 149 | 154 | 155 | 458 |
LoS | PIFA | 149 | 160 | 152 | 461 |
NLoS | BQ | 148 | 150 | 152 | 450 |
NLoS | PIFA | 143 | 147 | 147 | 437 |
Total | | 589 | 611 | 606 | 1,806 |
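The totals in Table 2 can be cross-checked with a few lines of arithmetic, which also gives the share of each class in the full dataset. The counts are copied from the table; the variable names are illustrative.

```python
# Spectrogram counts per (scenario, system) for labels 0, 1, 2, from Table 2.
counts = {
    ("LoS", "BQ"): (149, 154, 155),
    ("LoS", "PIFA"): (149, 160, 152),
    ("NLoS", "BQ"): (148, 150, 152),
    ("NLoS", "PIFA"): (143, 147, 147),
}

# Per-row totals, per-class totals, and the overall total.
row_totals = {key: sum(values) for key, values in counts.items()}
class_totals = tuple(sum(column) for column in zip(*counts.values()))
total = sum(class_totals)

# Fraction of the dataset belonging to each class.
class_share = [class_total / total for class_total in class_totals]
```

The three classes each account for roughly one third of the 1,806 spectrograms, so the dataset is close to balanced.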
Download and Use
This data may be used for non-commercial research purposes only. If you publish material based on this data, we request that you include a reference to one of our papers [1,2].
[1] Strohmayer, Julian, and Martin Kampel. "Data Augmentation Techniques for Cross-Domain WiFi CSI-Based Human Activity Recognition." IFIP International Conference on Artificial Intelligence Applications and Innovations. Cham: Springer Nature Switzerland, 2024.
[2] Strohmayer, Julian, and Martin Kampel. "Directional Antenna Systems for Long-Range Through-Wall Human Activity Recognition." arXiv preprint arXiv:2401.01388 (2024).
BibTeX citations:
@inproceedings{strohmayer2024data,
title={Data Augmentation Techniques for Cross-Domain WiFi CSI-Based Human Activity Recognition},
author={Strohmayer, Julian and Kampel, Martin},
booktitle={IFIP International Conference on Artificial Intelligence Applications and Innovations},
pages={42--56},
year={2024},
organization={Springer}}
@INPROCEEDINGS{strohmayer10647666,
author={Strohmayer, Julian and Kampel, Martin},
booktitle={2024 IEEE International Conference on Image Processing (ICIP)},
title={Directional Antenna Systems for Long-Range Through-Wall Human Activity Recognition},
year={2024},
volume={},
number={},
pages={3594-3599},
doi={10.1109/ICIP51287.2024.10647666}}
Files
CAD.zip
Total size: 136.4 MB
Size | Checksum |
---|---|
1.2 MB | md5:f706a5dd729daf5d8b498f713f8a538d |
135.1 MB | md5:57a0b9baa3532e1172b25d756536071a |
Additional details
Related works
- Is published in
- Conference paper: 10.1007/978-3-031-63211-2_4 (DOI)
- Conference paper: 10.1109/ICIP51287.2024.10647666 (DOI)
Dates
- Available: 2024-06-27
Software
- Repository URL: https://github.com/StrohmayerJ/wallhack1.8k
- Programming language: Python