Published September 1, 2025 | Version 1.0.0
Dataset Open

Air Quality Data from Regulatory AQMS and Low-Cost Sensors in Bulevar Sur, Valencia (July 08 - November 18, 2023)

Description

This dataset contains parallel time-series data for air quality and meteorological parameters, collected for the purpose of calibrating a low-cost air quality sensor (LCS) against a regulatory-grade reference monitoring station. The data was collected continuously for 165 days, from June 8, 2023, to November 20, 2023, at the Bulevar Sur air quality station in Valencia, Spain.

The dataset is divided into two main sources:

  1. Reference Data (GVA): The gva_*.csv files originate from an official Valencian AQ Monitoring Network (VAQMN) station, managed by the Generalitat Valenciana (GVA). This data represents the high-accuracy "ground truth" measurements from professional, regulatory-grade instruments.

     The selected station is located at (lat: 39.45037852, lon: -0.39631399) and is identified by its code 46250050.

  2. Low-Cost Sensor Data (LCS): The lcs_*.csv files contain raw data collected by a custom-built IoT node equipped with a ZPHS01B multi-sensor module. This node was co-located with the official GVA station to ensure measurements were taken under identical ambient conditions.

The raw sensor data has been processed and aggregated into synchronized time intervals of 10, 30, and 60 minutes to facilitate direct comparison and the training of machine learning models. This dataset was used to develop and evaluate machine learning algorithms to improve the accuracy of raw ozone (O3) readings from the low-cost sensor, as detailed in the related publication.
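As a rough illustration of the direct comparison these synchronized intervals allow, the sketch below pairs reference and low-cost O3 readings that share a timestamp. It assumes the CSV files are comma-delimited with a header row, that the O3 and rounded_datetime columns are named as listed under Variable Descriptions, and that timestamps are stored as plain integers; pair_o3 is a hypothetical helper name, and the sample rows are made up for the example.

```python
import csv
import io


def pair_o3(gva_file, lcs_file):
    """Return sorted (timestamp_ns, reference_o3, raw_lcs_o3) tuples
    for timestamps that have an O3 value in both files."""
    # Index reference rows by timestamp, skipping rows with an empty O3 cell.
    reference = {}
    for row in csv.DictReader(gva_file):
        if row["O3"].strip():
            reference[int(row["rounded_datetime"])] = float(row["O3"])

    pairs = []
    for row in csv.DictReader(lcs_file):
        ts = int(row["rounded_datetime"])
        if ts in reference and row["O3"].strip():
            pairs.append((ts, reference[ts], float(row["O3"])))
    return sorted(pairs)


# Made-up sample rows (not taken from the dataset) to show the behaviour.
gva_sample = io.StringIO(
    "rounded_datetime,O3\n"
    "1686182400000000000,61.0\n"
    "1686186000000000000,\n"
    "1686189600000000000,70.5\n"
)
lcs_sample = io.StringIO(
    "rounded_datetime,O3\n"
    "1686182400000000000,40.2\n"
    "1686186000000000000,41.0\n"
    "1686193200000000000,39.9\n"
)
sample_pairs = pair_o3(gva_sample, lcs_sample)
```

With the real files, the same function could be called on handles opened with open("gva_60.csv", newline="") and open("lcs_60.csv", newline=""), always pairing files of the same aggregation interval.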

Technical info

Files Description

  • gva_10.csv, gva_30.csv, gva_60.csv: Data from the official GVA reference station, aggregated at 10, 30, and 60-minute intervals, respectively.

  • lcs_10.csv, lcs_30.csv, lcs_60.csv: Data from the low-cost ZPHS01B sensor, aggregated at 10, 30, and 60-minute intervals, respectively.

Variable Descriptions

GVA Files (gva_*.csv)

  • NO: Nitric Oxide concentration (µg/m³)

  • NO2: Nitrogen Dioxide concentration (µg/m³)

  • NOX: Total Nitrogen Oxides concentration (µg/m³)

  • SO2: Sulfur Dioxide concentration (µg/m³)

  • O3: Ozone concentration (µg/m³) - This is the primary reference variable.

  • PM10_S/C: Particulate Matter < 10 µm (µg/m³)*

  • PM10: Particulate Matter < 10 µm (µg/m³)*

  • Temp: Ambient Temperature (°C)

  • Hum: Relative Humidity (%)

  • NH3: Ammonia concentration (µg/m³)*

  • NO_ECO, NO2_ECO: Readings from an electrochemical sensor within the station.*

  • rounded_datetime: Unix timestamp in nanoseconds, marking the start of the measurement interval.

* No values for this AQMS in this time period
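Because rounded_datetime is stored as nanoseconds since the Unix epoch, it usually needs converting before it can be read or plotted. A minimal sketch follows; it assumes the timestamps are referenced to UTC, which this record does not state, so check the result against the related publication. The helper name ns_to_datetime and the sample value are illustrative.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ns_to_datetime(ns):
    """Convert integer nanoseconds since the Unix epoch to a datetime.

    Assumption: the timestamps are in UTC (not confirmed by the record).
    """
    # Integer arithmetic avoids the rounding that float division would introduce.
    seconds, remainder_ns = divmod(int(ns), 1_000_000_000)
    return EPOCH + timedelta(seconds=seconds, microseconds=remainder_ns // 1000)


# Illustrative value, not read from the files.
start = ns_to_datetime(1686182400000000000)
```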

LCS Files (lcs_*.csv)

  • Temp: Ambient Temperature (°C)

  • Hum: Relative Humidity (%)

  • PM1: Particulate Matter < 1 µm (µg/m³)

  • PM2_5: Particulate Matter < 2.5 µm (µg/m³)

  • PM10: Particulate Matter < 10 µm (µg/m³)

  • VOC: Volatile Organic Compounds (level-based)

  • CH20: Formaldehyde concentration (mg/m³)

  • CO2: Carbon Dioxide concentration (ppm)

  • CO: Carbon Monoxide concentration (ppm)

  • O3: Raw Ozone concentration from the LCS (µg/m³, converted from ppm for the study) - This is the primary variable to be calibrated.

  • NO2: Raw Nitrogen Dioxide concentration from the LCS (ppm)

  • rounded_datetime: Unix timestamp in nanoseconds, marking the start of the measurement interval.
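Some LCS gases are reported in ppm while the reference station reports µg/m³, so a unit conversion is needed before comparing them (for example NO2). The sketch below applies the ideal-gas relation µg/m³ = ppm × M × P / (R × T). The default conditions (20 °C, 101325 Pa) are an assumption: this record does not state which conditions were used for the O3 conversion in the study. The function name ppm_to_ugm3 is hypothetical.

```python
R = 8.314462618  # molar gas constant, J/(mol*K)

# Molar masses in g/mol.
MOLAR_MASS = {"O3": 47.998, "NO2": 46.0055, "CO": 28.010}


def ppm_to_ugm3(ppm, gas, temp_c=20.0, pressure_pa=101325.0):
    """Convert a volume mixing ratio in ppm to a mass concentration in ug/m3.

    Assumption: ideal-gas behaviour at the given temperature and pressure;
    the defaults are illustrative, not taken from the dataset record.
    """
    moles_of_air_per_m3 = pressure_pa / (R * (temp_c + 273.15))
    # ppm (1e-6 mol/mol) * g/mol * mol/m3 gives 1e-6 g/m3, i.e. ug/m3.
    return ppm * MOLAR_MASS[gas] * moles_of_air_per_m3


# Illustrative reading of 0.020 ppm NO2 (not a value from the files).
no2_ugm3 = ppm_to_ugm3(0.020, "NO2")
```

Passing the co-located Temp reading as temp_c would give a condition-specific conversion instead of a fixed factor; which option matches the reference data depends on how the station reports its values.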

 

Data Quality and Missing Values

Users should be aware that some files and variables within this dataset contain missing values.

  • GVA Data (gva_*.csv): Several columns in the GVA files, such as PM10_S/C, PM10, NH3, NO_ECO, and NO2_ECO, have no values for this AQMS in this time period.

  • LCS Data (lcs_*.csv): The low-cost sensor data is generally more complete, but intermittent sensor or communication failures may have resulted in occasional missing rows.

We recommend that users perform an initial check for missing data and implement an appropriate handling strategy before analysis.
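One possible form for that initial check is sketched below: it counts empty cells per column and lists gaps between consecutive timestamps that are longer than the aggregation interval. It assumes missing values appear as empty cells or the text "NaN", and that rounded_datetime holds integer nanoseconds; neither encoding is confirmed by this record. The function names and sample rows are illustrative.

```python
import csv
import io


def missing_counts(csv_file):
    """Return (total_rows, {column: number_of_missing_cells})."""
    reader = csv.DictReader(csv_file)
    counts = {name: 0 for name in (reader.fieldnames or [])}
    total = 0
    for row in reader:
        total += 1
        for name in counts:
            value = (row[name] or "").strip()
            # Assumption: missing values are empty cells or the text "NaN".
            if value == "" or value.lower() == "nan":
                counts[name] += 1
    return total, counts


def find_gaps(timestamps_ns, interval_minutes):
    """Return (previous, next) timestamp pairs separated by more than one interval."""
    step = interval_minutes * 60 * 1_000_000_000
    ordered = sorted(timestamps_ns)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > step]


# Made-up sample rows (not taken from the dataset).
sample = io.StringIO(
    "rounded_datetime,O3,NH3\n"
    "1686182400000000000,61.0,\n"
    "1686186000000000000,,\n"
)
total_rows, counts = missing_counts(sample)
gaps = find_gaps([0, 600_000_000_000, 1_800_000_000_000], interval_minutes=10)
```

Columns whose missing count equals the row count (as described for NH3 in the GVA files) can simply be dropped, while gaps in the LCS series call for a choice between discarding the affected intervals and interpolating.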

Files

bulevar-sur-dataset.zip (2.1 MB)
md5:37c486db7953b813d2f5797fd74e1ff2
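The published MD5 checksum can be used to confirm that the archive downloaded intact. A minimal sketch, assuming the file was saved locally under its original name (the path is an assumption; the helper names are hypothetical):

```python
import hashlib

# Checksum published with this record.
EXPECTED_MD5 = "37c486db7953b813d2f5797fd74e1ff2"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True when the file's MD5 digest matches the expected checksum."""
    return md5_of_file(path) == expected
```

For example, verify_download("bulevar-sur-dataset.zip") should return True for an uncorrupted copy of the archive.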

Additional details

Related works

Is supplement to
Journal article: 10.5194/amt-2024-127 (DOI)

Funding

  • Ministerio de Ciencia, Innovación y Universidades: PID2021-126823OB-I00
  • Ministerio de Ciencia, Innovación y Universidades: TED2021-131040B-C33
  • Generalitat Valenciana: CIAICO/2022/179
  • Generalitat Valenciana: CIAEST/2022/64
  • Generalitat Valenciana: CIACIF/2023/416
  • Generalitat Valenciana: CIAEST/2024/71
  • Ministerio de Educación y Formación Profesional: PRX23/00589

Dates

Created
2023-06-08/2023-11-18