{
  "DOI": "10.5281/zenodo.3882104",
  "abstract": "EyeFi Dataset\n\n\nThis dataset is collected as a part of the EyeFi project at Bosch Research and Technology Center, Pittsburgh, PA, USA. The dataset contains WiFi CSI values of human motion trajectories along with ground truth location information captured through a camera. This\u00a0dataset is used in the following paper \"EyeFi: Fast Human Identification Through Vision and WiFi-based Trajectory Matching\" that is published in the IEEE International Conference on Distributed Computing in Sensor Systems 2020 (DCOSS '20). We also published a dataset paper titled as \"Dataset: Person Tracking and Identification using Cameras and Wi-Fi Channel State Information (CSI) from Smartphones\"\u00a0in Data: Acquisition to Analysis 2020 (DATA '20) workshop describing details of data collection. Please check it out for more information on the dataset.\n\n\nClarification/Bug report: Please note that the order of antennas and subcarriers in .h5 files is not written clearly in the README.md file. The order of antennas and subcarriers are as follows for the 90 `csi_real` and `csi_imag` values : [subcarrier1-antenna1, subcarrier1-antenna2, subcarrier1-antenna3, subcarrier2-antenna1, subcarrier2-antenna2, subcarrier2-antenna3,\u2026 subcarrier30-antenna1, subcarrier30-antenna2, subcarrier30-antenna3]. Please see the description below. The newer version of the dataset contains this information in README.md. We are sorry for the inconvenience.\n\n\nData Collection Setup\n\nIn our experiments, we used Intel 5300 WiFi Network Interface Card (NIC) installed in an Intel NUC and Linux CSI tools [1] to extract the WiFi CSI packets. The (x,y) coordinates of the subjects are collected from Bosch Flexidome IP Panoramic 7000 panoramic camera mounted on the ceiling and Angle of Arrivals (AoAs) are derived from the (x,y) coordinates. 
Both the WiFi card and the camera are located at the same origin coordinates but at different heights: the camera is located around 2.85 m above the ground and the WiFi antennas are around 1.12 m above the ground.\n\n\nThe data collection environment consists of two areas. The first is a rectangular space measuring 11.8 m x 8.74 m, and the second is an irregularly shaped kitchen area with maximum distances of 19.74 m and 14.24 m between two walls. The kitchen also has numerous obstacles and different materials with different RF reflection characteristics, including strong reflectors such as metal refrigerators and dishwashers.\n\n\nTo collect the WiFi data, we used a Google Pixel 2 XL smartphone as an access point and connected the Intel 5300 NIC to it for WiFi communication. The transmission rate is about 20-25 packets per second. The same WiFi card and phone are used in both the lab and kitchen areas.\n\n\nList of Files\nHere is a list of files included in the dataset:\n\n\n|- 1_person\n\u00a0 \u00a0 |- 1_person_1.h5\n\u00a0 \u00a0 |- 1_person_2.h5\n|- 2_people\n\u00a0 \u00a0 |- 2_people_1.h5\n\u00a0 \u00a0 |- 2_people_2.h5\n\u00a0 \u00a0 |- 2_people_3.h5\n|- 3_people\n\u00a0 \u00a0 |- 3_people_1.h5\n\u00a0 \u00a0 |- 3_people_2.h5\n\u00a0 \u00a0 |- 3_people_3.h5\n|- 5_people\n\u00a0 \u00a0 |- 5_people_1.h5\n\u00a0 \u00a0 |- 5_people_2.h5\n\u00a0 \u00a0 |- 5_people_3.h5\n\u00a0 \u00a0 |- 5_people_4.h5\n|- 10_people\n\u00a0 \u00a0 |- 10_people_1.h5\n\u00a0 \u00a0 |- 10_people_2.h5\n\u00a0 \u00a0 |- 10_people_3.h5\n|- Kitchen\n\u00a0 \u00a0 |- 1_person\n\u00a0 \u00a0 \u00a0 \u00a0 |- kitchen_1_person_1.h5\n\u00a0 \u00a0 \u00a0 \u00a0 |- kitchen_1_person_2.h5\n\u00a0 \u00a0 \u00a0 \u00a0 |- kitchen_1_person_3.h5\n\u00a0 \u00a0 |- 3_people\n\u00a0 \u00a0 \u00a0 \u00a0 |- kitchen_3_people_1.h5\n|- training\n\u00a0 \u00a0 |- shuffuled_train.h5\n\u00a0 \u00a0 |- shuffuled_valid.h5\n\u00a0 \u00a0 |- shuffuled_test.h5\nView-Dataset-Example.ipynb\nREADME.md\n\n\n\n\nIn 
this dataset, the folders `1_person/`, `2_people/`, `3_people/`, `5_people/`, and `10_people/` contain data collected from the lab area, whereas the `Kitchen/` folder contains data collected from the kitchen area. To see how each file is structured, see the section Access the data below.\n\n\nThe training folder contains the training dataset we used to train the neural network discussed in our paper. Its files are generated by shuffling all the data from the `1_person/` folder collected in the lab area (`1_person_1.h5` and `1_person_2.h5`).\n\n\nWhy multiple files in one folder?\n\n\nEach folder contains multiple files. For example, the `1_person` folder has two files: `1_person_1.h5` and `1_person_2.h5`. Files in the same folder always have the same number of human subjects present simultaneously in the scene. However, the person holding the phone can differ between files. Also, the data could have been collected on different days, and/or the data collection system may have been rebooted due to stability issues. As a result, we provide separate files (like `1_person_1.h5`, `1_person_2.h5`) to distinguish between different people holding the phone and between possible system reboots, which introduce different phase offsets (see below) in the system.\n\n\nSpecial note:\n\n\n`1_person_1.h5` was generated with the same person holding the phone throughout, whereas `1_person_2.h5` contains different people holding the phone, with only one person present in the area at a time. The two files were also collected on different days.\n\n\n\nAccess the data\nTo access the data, an HDF5 library is needed to open the dataset. A free HDF5 viewer is available on the official website: https://www.hdfgroup.org/downloads/hdfview/. 
We also provide an example Python notebook, View-Dataset-Example.ipynb, that demonstrates how to access the data.\n\n\nEach file, except those under the *\"training/\"* folder, is structured as follows:\n\n\n|- csi_imag\n|- csi_real\n|- nPaths_1\n\u00a0 \u00a0 |- offset_00\n\u00a0 \u00a0 \u00a0 \u00a0 |- spotfi_aoa\n\u00a0 \u00a0 |- offset_11\n\u00a0 \u00a0 \u00a0 \u00a0 |- spotfi_aoa\n\u00a0 \u00a0 |- offset_12\n\u00a0 \u00a0 \u00a0 \u00a0 |- spotfi_aoa\n\u00a0 \u00a0 |- offset_21\n\u00a0 \u00a0 \u00a0 \u00a0 |- spotfi_aoa\n\u00a0 \u00a0 |- offset_22\n\u00a0 \u00a0 \u00a0 \u00a0 |- spotfi_aoa\n|- nPaths_2\n\u00a0 \u00a0 |- offset_00\n\u00a0 \u00a0 \u00a0 \u00a0 |- spotfi_aoa\n\u00a0 \u00a0 |- offset_11\n\u00a0 \u00a0 \u00a0 \u00a0 |- spotfi_aoa\n\u00a0 \u00a0 |- offset_12\n\u00a0 \u00a0 \u00a0 \u00a0 |- spotfi_aoa\n\u00a0 \u00a0 |- offset_21\n\u00a0 \u00a0 \u00a0 \u00a0 |- spotfi_aoa\n\u00a0 \u00a0 |- offset_22\n\u00a0 \u00a0 \u00a0 \u00a0 |- spotfi_aoa\n|- nPaths_3\n\u00a0 \u00a0 |- offset_00\n\u00a0 \u00a0 \u00a0 \u00a0 |- spotfi_aoa\n\u00a0 \u00a0 |- offset_11\n\u00a0 \u00a0 \u00a0 \u00a0 |- spotfi_aoa\n\u00a0 \u00a0 |- offset_12\n\u00a0 \u00a0 \u00a0 \u00a0 |- spotfi_aoa\n\u00a0 \u00a0 |- offset_21\n\u00a0 \u00a0 \u00a0 \u00a0 |- spotfi_aoa\n\u00a0 \u00a0 |- offset_22\n\u00a0 \u00a0 \u00a0 \u00a0 |- spotfi_aoa\n|- nPaths_4\n\u00a0 \u00a0 |- offset_00\n\u00a0 \u00a0 \u00a0 \u00a0 |- spotfi_aoa\n\u00a0 \u00a0 |- offset_11\n\u00a0 \u00a0 \u00a0 \u00a0 |- spotfi_aoa\n\u00a0 \u00a0 |- offset_12\n\u00a0 \u00a0 \u00a0 \u00a0 |- spotfi_aoa\n\u00a0 \u00a0 |- offset_21\n\u00a0 \u00a0 \u00a0 \u00a0 |- spotfi_aoa\n\u00a0 \u00a0 |- offset_22\n\u00a0 \u00a0 \u00a0 \u00a0 |- spotfi_aoa\n|- num_obj\n|- obj_0\n\u00a0 \u00a0 |- cam_aoa\n\u00a0 \u00a0 |- coordinates\n|- obj_1\n\u00a0 \u00a0 |- cam_aoa\n\u00a0 \u00a0 |- coordinates\n...\n|- timestamp\n\n\n\n`csi_real` and `csi_imag` are the real and imaginary parts of the CSI measurements. 
The order of antennas and subcarriers for the 90 `csi_real` and `csi_imag` values is as follows: [subcarrier1-antenna1, subcarrier1-antenna2, subcarrier1-antenna3, subcarrier2-antenna1, subcarrier2-antenna2, subcarrier2-antenna3,\u2026 subcarrier30-antenna1, subcarrier30-antenna2, subcarrier30-antenna3]. The `nPaths_x` groups contain the WiFi Angle of Arrival (AoA) calculated by SpotFi [2], where `x` is the number of paths specified during the calculation. Under each `nPaths_x` group are `offset_xx` subgroups, where `xx` stands for the offset combination used to correct the phase offset during the SpotFi calculation. We measured the offsets as:\n\n\n|Antennas | Offset 1 (rad) | Offset 2 (rad) |\n|:-------:|:---------------:|:-------------:|\n|  1 & 2  |     1.1899      |     -2.0071     |\n|  1 & 3  |     1.3883      |     -1.8129     |\n\n\n\n\nThe measurement is based on the work in [3], where the authors state that there are two possible offsets between two antennas, which we measured by booting the device multiple times. The combination of the offsets is used for the `offset_xx` naming. For example, `offset_12` means that offset 1 between antennas 1 & 2 and offset 2 between antennas 1 & 3 are used in the SpotFi calculation.\n\n\nThe `num_obj` field stores the number of human subjects present in the scene. `obj_0` is always the subject who is holding the phone. Each file contains `num_obj` groups named `obj_x`. For each `obj_x`, we have the `coordinates` reported from the camera and `cam_aoa`, which is the AoA estimated from the camera-reported coordinates. The (x,y) coordinates and AoA listed here are chronologically ordered (except for the files in the `training` folder). 
They reflect the way the person carrying the phone moved in the space (for `obj_0`) and the way everyone else walked (for other `obj_y`, where `y` > 0).\n\n\nThe `timestamp` is provided as a time reference for each WiFi packet.\n\n\nTo access the data (Python):\n\n\nimport h5py\n\n# Open one of the dataset files in read-only mode\ndata = h5py.File('3_people_3.h5','r')\n\n# Real and imaginary parts of the CSI measurements\ncsi_real = data['csi_real'][()]\ncsi_imag = data['csi_imag'][()]\n\n# Camera AoA and (x,y) coordinates of the person holding the phone\ncam_aoa = data['obj_0/cam_aoa'][()]\ncam_loc = data['obj_0/coordinates'][()]\n\n\n\nFor files inside the `training/` folder:\n\n\nFiles inside the `training/` folder have a different data structure:\n\n\n\n|- nPath-1\n\u00a0 \u00a0 |- aoa\n\u00a0 \u00a0 |- csi_imag\n\u00a0 \u00a0 |- csi_real\n\u00a0 \u00a0 |- spotfi\n|- nPath-2\n\u00a0 \u00a0 |- aoa\n\u00a0 \u00a0 |- csi_imag\n\u00a0 \u00a0 |- csi_real\n\u00a0 \u00a0 |- spotfi\n|- nPath-3\n\u00a0 \u00a0 |- aoa\n\u00a0 \u00a0 |- csi_imag\n\u00a0 \u00a0 |- csi_real\n\u00a0 \u00a0 |- spotfi\n|- nPath-4\n\u00a0 \u00a0 |- aoa\n\u00a0 \u00a0 |- csi_imag\n\u00a0 \u00a0 |- csi_real\n\u00a0 \u00a0 |- spotfi\n\n\n\n\nIn each group name `nPath-x`, `x` is the number of paths specified during the SpotFi calculation. `aoa` is the camera-generated angle of arrival (AoA), which can be considered ground truth; `csi_imag` and `csi_real` are the imaginary and real components of the CSI values. `spotfi` contains the SpotFi-calculated AoA values. The SpotFi values are chosen based on the lowest median and mean error across `1_person_1.h5` and `1_person_2.h5`. All the rows under the same `nPath-x` group are aligned (i.e., the first row of `aoa` corresponds to the first row of `csi_imag`, `csi_real`, and `spotfi`). 
There is no timestamp recorded, and the sequence of the data is not chronological because the rows are randomly shuffled from the `1_person_1.h5` and `1_person_2.h5` files.\n\n\nCitation\nIf you use the dataset, please cite our paper:\n\n\n@inproceedings{eyefi2020,\n\u00a0 title={EyeFi: Fast Human Identification Through Vision and WiFi-based Trajectory Matching},\n\u00a0 author={Fang, Shiwei and Islam, Tamzeed and Munir, Sirajum and Nirjon, Shahriar},\n\u00a0 booktitle={2020 IEEE International Conference on Distributed Computing in Sensor Systems (DCOSS)},\n\u00a0 year={2020},\n\u00a0 organization={IEEE}\n}\n\n\n\nThanks!\n\n\nReferences\n\n\n1. Halperin, Daniel, et al. \"Tool release: Gathering 802.11n traces with channel state information.\" ACM SIGCOMM Computer Communication Review 41.1 (2011): 53-53.\n\n\n2. Kotaru, Manikanta, et al. \"SpotFi: Decimeter level localization using WiFi.\" Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication. 2015.\n\n\n3. Zhang, Dongheng, et al. \"Calibrating Phase Offsets for Commodity WiFi.\" IEEE Systems Journal (2019).",
  "author": [
    {
      "family": "Shiwei Fang"
    },
    {
      "family": "Tamzeed Islam"
    },
    {
      "family": "Sirajum Munir"
    },
    {
      "family": "Shahriar Nirjon"
    }
  ],
  "event": "IEEE International Conference on Distributed Computing in Sensor Systems (DCOSS)",
  "id": "3882104",
  "issued": {
    "date-parts": [
      [
        "2020",
        "06",
        "06"
      ]
    ]
  },
  "language": "eng",
  "publisher": "Zenodo",
  "title": "EyeFi: Fast Human Identification Through Vision and WiFi-based Trajectory Matching",
  "type": "dataset",
  "version": "1.0.0"
}