############################################################################################
The S&M-HSTPM2d5 dataset: High Spatial-Temporal Resolution PM 2.5 Measures in Multiple
Cities Sensed by Static & Mobile Devices, Version 1.0.0
############################################################################################

This Read_Me.txt file provides a brief description of the S&M-HSTPM2d5 dataset and
instructions for accessing it.

Static and mobile devices were deployed in three Chinese cities (Foshan, Cangzhou, and
Tianjin) to collect real PM2.5 measurements in each city with high spatial and temporal
resolution. The S&M-HSTPM2d5 dataset contains the PM2.5 concentration measurements,
timestamps, and GPS records of the static and mobile devices.

The dataset is available at:
https://doi.org/10.5281/zenodo.4028130

The data descriptor paper, which details the data collection and cleaning process, is under
submission. For proper citation of the manuscript, please refer to the latest version of
this dataset, which includes the details.

This dataset and its descriptor paper were created by:
Xinlei Chen, Xinyu Liu, Kent Xin Kin Eng, Jingxiao Liu, Hae Young Noh, Lin Zhang, and
Pei Zhang

For questions or suggestions, please e-mail Xinlei Chen.

============================================================================================
Description
============================================================================================

The S&M-HSTPM2d5 dataset contains high spatial and temporal resolution PM2.5 measurements
with the corresponding timestamps and GPS locations of mobile and static devices in three
Chinese cities: Foshan, Cangzhou, and Tianjin. A different number of static and mobile
devices was set up in each city. The sampling interval was one minute in Cangzhou and
three seconds in Foshan and Tianjin. For details of the setup, please refer to the
Device_Setup_Description.txt file in this repository and to the data descriptor paper.
After data collection, a data cleaning process was performed to remove or adjust abnormal
and drifting data. The script of the data cleaning algorithm is provided in this
repository. The algorithm only adjusts or removes individual data points. Removal of an
entire device's data was done after running the data cleaning algorithm, based on
empirical judgment and graphic visualization. For details of the data cleaning process,
please refer to the script (Data_cleaning_algorithm.ipynb) in this repository and to the
data descriptor paper.

The dataset in this repository is the processed version. The raw dataset and the removed
devices are not included in this repository.

The data are stored as CSV files. Each CSV file is named by its device ID and contains the
data collected by that device. Each CSV file has three types of data: timestamp in China
Standard Time (GMT+8), geographic location as latitude and longitude, and PM2.5
concentration in micrograms per cubic meter. The CSV files are stored in either the Static
or the Mobile folder, according to the device type. The Static and Mobile folders are
stored in the corresponding city's folder.

Any programming language that can read CSV files can be used to access the dataset. Users
can also open the CSV files directly. The get_dataset.ipynb file in this repository
provides another option for accessing the dataset. To execute the ipynb files, Jupyter
Notebook with Python 3.0 is required. The following Python libraries are also required:

get_dataset.ipynb:
    1. os library
    2. pandas library

Data_cleaning_algorithm.ipynb:
    1. os library
    2. pandas library
    3. datetime library
    4. math library

The os, datetime, and math libraries are part of the Python standard library, so only
pandas needs to be installed separately. Installation instructions can be found online.
After installing Jupyter Notebook with Python 3.0 and the required libraries, users can
open the ipynb files with Jupyter Notebook and follow the instructions inside each file.
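
For users who prefer a plain script over the notebook, the folder layout described above
(city folder, then Static or Mobile, then one CSV per device ID) could be traversed roughly
as follows. This is only a minimal sketch and is not the code in get_dataset.ipynb. The
function names list_device_files and read_device_csv are hypothetical, the dataset is
assumed to be already unzipped under a single root folder, and each CSV is assumed to be
UTF-8 text that starts with a header row. No column names are assumed; they are taken from
whatever header the file contains.

```python
import csv
from pathlib import Path

# City and device-type folder names as given in this Read_Me.
CITIES = ("Cangzhou", "Foshan", "Tianjin")
DEVICE_TYPES = ("Static", "Mobile")


def list_device_files(root):
    """Return (city, device_type, device_id, path) for every device CSV under root."""
    records = []
    for city in CITIES:
        for device_type in DEVICE_TYPES:
            folder = Path(root) / city / device_type
            if not folder.is_dir():
                continue  # e.g. this city's archive has not been unzipped yet
            for path in sorted(folder.glob("*.csv")):
                # The file name without ".csv" is the device ID.
                records.append((city, device_type, path.stem, path))
    return records


def read_device_csv(path):
    """Read one device file into a list of dicts keyed by the file's own header row.

    Assumption: the first row of the CSV is a header. All values are returned as
    strings; converting timestamps or numbers is left to the caller.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

A call such as list_device_files("S&M-HSTPM2d5") would then give one entry per device, and
each path can be passed to read_device_csv or to pandas.read_csv instead.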
============================================================================================
File list
============================================================================================

Please first unzip the following archives:

/S&M-HSTPM2d5/Cangzhou.zip
/S&M-HSTPM2d5/Foshan.zip
/S&M-HSTPM2d5/Tianjin.zip

/S&M-HSTPM2d5/
    Read_Me.txt                       This file
    Device_Setup_Description.txt      Description of the setup of individual devices in
                                      each city
    Data_cleaning_algorithm.ipynb     Script used in the data cleaning process
    get_dataset.ipynb                 Script for accessing the dataset

    Cangzhou/
        Static/
            id_1.csv                  Individual static device's
            id_2.csv                  data in Cangzhou. Each csv
            ...                       file represents one device
        Mobile/
            id_1.csv                  Individual mobile device's
            id_2.csv                  data in Cangzhou. Each csv
            ...                       file represents one device

    Foshan/
        Static/
            id_1.csv                  Individual static device's
            id_2.csv                  data in Foshan. Each csv
            ...                       file represents one device
        Mobile/
            id_1.csv                  Individual mobile device's
            id_2.csv                  data in Foshan. Each csv
            ...                       file represents one device

    Tianjin/
        Static/
            id_1.csv                  Individual static device's
            id_2.csv                  data in Tianjin. Each csv
            ...                       file represents one device
        Mobile/
            id_1.csv                  Individual mobile device's
            id_2.csv                  data in Tianjin. Each csv
            ...                       file represents one device
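
The unzip step listed above can also be scripted. The following is a minimal sketch using
Python's standard zipfile module. The function name extract_city_archives is hypothetical,
and the sketch assumes that the three archives sit together in one folder and that each
archive contains its city folder at the top level, so that extracting in place produces
the layout shown in the file list.

```python
import zipfile
from pathlib import Path

# Archive names as given in the file list.
ARCHIVES = ("Cangzhou.zip", "Foshan.zip", "Tianjin.zip")


def extract_city_archives(dataset_dir):
    """Extract each city archive found in dataset_dir; return the archive names extracted."""
    dataset_dir = Path(dataset_dir)
    extracted = []
    for name in ARCHIVES:
        archive = dataset_dir / name
        if not archive.is_file():
            continue  # skip archives that were not downloaded
        with zipfile.ZipFile(archive) as zf:
            # Assumption: the archive holds "<City>/Static/..." and "<City>/Mobile/...".
            zf.extractall(dataset_dir)
        extracted.append(name)
    return extracted
```

Calling extract_city_archives("S&M-HSTPM2d5") returns the names of the archives it found
and extracted, which makes it easy to notice a missing download.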