Dataset: Environmental Impact on the Long-Term Connectivity and Link Quality of an Outdoor LoRa Network

This repository contains the long-term connectivity and link quality dataset collected on ChirpBox over 4 months (May -- September 2021) in the city of Shanghai, China.

In addition to the dataset itself, we provide evaluation scripts for data analysis and visualization to facilitate data exploration and re-use. To illustrate how to use these scripts, we provide a Jupyter notebook, dataset.ipynb, for dataset visualization.

List of files:

  1. dataset_03052021_15092021.csv The dataset includes LoRa connectivity and link quality, as well as environmental information, collected from May 3 to September 15, 2021.
  2. data_analysis.py The script for dataset analysis and visualization. One can use the functions in this script to derive network-level statistics (e.g., the average number of correctly-exchanged packets), link-level statistics (e.g., SNR, RSS, and PRR), and node-level statistics (e.g., the number of neighbours and the temperature evolution over time).
  3. metadata_processing.py The script for pre-processing metadata into CSV files. One can use the functions in this script to convert metadata for each measurement saved in TXT and JSON formats to CSV files that include attributes such as link quality, connectivity, and environmental information, an example of which is dataset_03052021_15092021.csv.
  4. dataset.ipynb The Jupyter notebook containing examples of dataset visualization and metadata pre-processing using the functions in data_analysis.py and metadata_processing.py.
  5. topology_map.png The node deployment map used to create topology figures. A usage example is Figure 1 shown in the notebook dataset.ipynb.
  6. dataset_metadata.zip The dataset metadata, stored in TXT and JSON formats: link quality, connectivity, and on-board sensor data are stored in TXT files, while weather information is stored in JSON files.
  7. README.md The README.md explains all the files in this repository and gives some examples of how to use the provided scripts to analyze the dataset.
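Since the README does not enumerate the columns of dataset_03052021_15092021.csv, a quick way to see which attributes are available is to read the header row. A minimal sketch using only the Python standard library (the helper name read_csv_header is illustrative, not part of the provided scripts):

```python
import csv

def read_csv_header(csv_path):
    """Return the list of column names from the first row of a CSV file."""
    with open(csv_path, newline="") as f:
        return next(csv.reader(f))

# Example: read_csv_header("dataset_03052021_15092021.csv")
```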

Access the dataset

Below, we show some steps and examples of using the provided scripts to visualize the dataset and pre-process the metadata. All example code can be found and run directly in the Jupyter notebook: dataset.ipynb.

Requirements

The example code assumes that the following files and Python environment are prepared:

File preparation:

  • Download all the files listed in this repository to the same directory.
  • Unzip the metadata dataset_metadata.zip into a subdirectory with the same name: dataset_metadata
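The unzipping step can also be done programmatically. A minimal sketch using Python's standard zipfile module, assuming the archive sits in the current directory as described above (the helper name unzip_metadata is illustrative):

```python
import pathlib
import zipfile

def unzip_metadata(zip_path, target_dir):
    """Extract the metadata archive into target_dir, creating it if needed."""
    target = pathlib.Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(target)
    return target

# Example: unzip_metadata("dataset_metadata.zip", "dataset_metadata")
```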

Python environment:

  • Python v3.9+ (tested with 3.9.1 on Windows)
  • Python dependencies
    • networkx
    • seaborn
    • plotly
    • One can install these Python packages with pip:
      pip3 install networkx
      pip3 install seaborn
      pip3 install plotly
      pip3 install ipykernel
      pip3 install --upgrade nbformat
      pip3 install -U kaleido
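
Before running the notebook, one can check that the dependencies are importable. A small sketch mirroring the package list above (the helper name missing_packages is illustrative):

```python
import importlib.util

REQUIRED = ["networkx", "seaborn", "plotly", "ipykernel", "nbformat", "kaleido"]

def missing_packages(packages=REQUIRED):
    """Return the subset of packages not importable in this environment."""
    return [p for p in packages if importlib.util.find_spec(p) is None]

# Any names returned here still need a `pip3 install`.
```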
      
Examples

Dataset visualization:

  1. In order to use the functions defined in data_analysis.py for visualization, import the Analysis class:
    from data_analysis import *
    dataset_tool = Analysis() # Instantiate the data analysis class
    
  2. Specify the figure output directory and the address of the processed CSV file:
    import pathlib # provide the working directory
    # Specify the folder for output processing
    directory_path = str(pathlib.Path.cwd()) # Current directory of notebook
    dataset_CSV = directory_path + "/dataset_03052021_15092021.csv" # path to the processed CSV file
    
  3. Show the connectivity in the network when using SF = 7 and a 480 MHz RF channel:
    id_list = list(range(21)) # 21 nodes for current dataset
    sf_list = [7] # SF = 7
    freq_list = [480000] # 480 MHz, expressed in kHz
    plot_date = ["2021-05-07 00:00:00", "2021-05-07 12:00:00"] # plot start and end time
    plot_type = ["topology", "using_pos2"] # plot type and node-position layout
    dataset_tool.dataset_analysis(sf_list, freq_list, id_list, directory_path, dataset_CSV, plot_type, plot_date) # dataset visualization with configurations
    
  4. The output figures are saved in PNG (default) or PDF format in the processed_plots subdirectory of the specified directory.
  5. Below are the names of all available visualization types:
    plot_type = ["topology", "using_pos0/1/2"]
    plot_type = ["max_min_temperature"]
    plot_type = ["MAX/AVG/MIN_link_RSSI_temperature_plot"]
    plot_type = ["subplot_PRR", "subplot_degree"]
    plot_type = ["heatmap"]
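
The timestamps in plot_date above follow the `YYYY-MM-DD HH:MM:SS` pattern. A small sketch for validating a plotting window before passing it to dataset_analysis (the helper name validate_window is illustrative, not part of data_analysis.py):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

def validate_window(plot_date):
    """Parse the [start, end] strings and check that the window is ordered."""
    start, end = (datetime.strptime(t, FMT) for t in plot_date)
    if start >= end:
        raise ValueError("plot window start must precede its end")
    return start, end

# validate_window(["2021-05-07 00:00:00", "2021-05-07 12:00:00"])
```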
    

Please see more detailed examples and explanations in the Jupyter notebook: dataset.ipynb.

Dataset pre-processing:

  • To process the dataset metadata oneself, one can use the metadata processing tool to generate the same kind of CSV file as dataset_03052021_15092021.csv. The generated file, named dataset.csv, will be placed in the subdirectory dataset_metadata/dataset/ under the specified directory.
     from metadata_processing import *
     dataset_processing_tool = TXT_metadata() # Instantiate the metadata processing class
     dataset_processing_tool.metadata_to_CSV(directory_path + "/dataset_metadata/", id_list) # forward slashes also work on Windows
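
After processing finishes, the generated file can be located under the expected output path. A minimal sketch (the helper name find_generated_csv is illustrative, not part of metadata_processing.py):

```python
import pathlib

def find_generated_csv(base_dir):
    """Return the path of dataset.csv produced under dataset_metadata/dataset/,
    or None if processing has not been run yet."""
    out = pathlib.Path(base_dir) / "dataset_metadata" / "dataset" / "dataset.csv"
    return out if out.is_file() else None
```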