Published April 14, 2025 | Version v1
Dataset Open

High-Resolution Dataset of Time-Integrated 3D Urban Pollutant Concentration Fields

  • 1. CEA DAM Île-de-France
  • 2. CEA LETI
  • 3. IRT Saint Exupéry

Description

📄 Dataset Description

This dataset contains synthetic, time-integrated 3D concentration fields of an air pollutant in urban environments, generated using the PMSS (Parallel Micro Swift Spray) atmospheric dispersion modeling system. The data are provided as NetCDF4 files, representing pollutant concentration fields integrated over a two-hour period for various atmospheric and emission source scenarios. 

The dataset totals approximately 165 GB of compressed data and includes multiple simulation instances, each corresponding to a specific initialization defined by:

  • A hypothetical point source emitting a unit mass of pollutant (gas or particulate matter) at 2 m height

  • One of 108 stationary weather conditions, derived from combinations of wind directions (θ ∈ {0°, 10°, ..., 350°}) and wind speeds (v ∈ {1.5, 3.5, 6} m/s)

  • Urban geometries based on two cities: Grenoble and Paris, each provided as a separate compressed folder.
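
The count of 108 weather conditions follows from 36 wind directions combined with 3 wind speeds. As a minimal sketch, the combinations can be enumerated in Python; the variable names are illustrative and are not part of the dataset:

```python
from itertools import product

# Wind directions in degrees: 0, 10, ..., 350 (36 values).
wind_directions = list(range(0, 360, 10))
# Wind speeds in m/s, as listed in the description.
wind_speeds = [1.5, 3.5, 6.0]

# Every (direction, speed) pair is one stationary weather condition.
weather_conditions = list(product(wind_directions, wind_speeds))

print(len(weather_conditions))  # 36 * 3 = 108
```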

 

📦 Folder Structure

The dataset is distributed in split archive format due to its large size, with the following compressed folders:

  • raw_paris.tar.gz.part_aa ... raw_paris.tar.gz.part_af ➡ produce raw_paris.tar.gz once reconstructed
  • raw_grenoble.tar.gz.part_aa ... raw_grenoble.tar.gz.part_ak ➡ produce raw_grenoble.tar.gz once reconstructed

Each archive, once reconstructed and extracted, contains:

🔹 Integrated concentration fields (.nc)

Time-integrated 3D concentration fields simulated for each scenario. File naming convention:

 
concint_DXXXUYYY_PTZZZ.nc 
  • DXXX: Wind direction in degrees

  • UYYY: Wind speed in m/s

  • PTZZZ: Emission source reference index
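
The naming convention above lends itself to a small parser. The sketch below is one possible approach in Python, not an official tool: the description gives only the template, so the regular expression, the function name parse_concint_name and the example file name are assumptions, and the three tokens are returned as raw strings without interpreting how the wind speed is encoded.

```python
import re

# Assumed pattern: the description only gives the template
# concint_DXXXUYYY_PTZZZ.nc, so tokens are captured as raw strings.
FILENAME_PATTERN = re.compile(
    r"^concint_D(?P<direction>[^U_]+)U(?P<speed>[^_]+)_PT(?P<source>[^.]+)\.nc$"
)


def parse_concint_name(filename):
    """Return the raw direction, speed and source tokens, or None if no match."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        return None
    return match.groupdict()


# Hypothetical file name, for illustration only.
print(parse_concint_name("concint_D010U035_PT012.nc"))
```

Keeping the tokens as strings avoids guessing the numeric encoding; they can be converted once the actual file names have been inspected.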

🔹 Source coordinates

  • PointsRejets_rel.csv: CSV file listing the relative X and Y coordinates of the emission sources. Each row corresponds to a unique emission source index.

  • Reference coordinate systems:

    • Grenoble: (912000, 6456000)

    • Paris: (649000, 6862000)

  • Emission height is constant across all experiments.
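
If absolute source positions are needed, the relative coordinates can presumably be shifted by the reference coordinates listed above. The Python sketch below assumes that the relative X and Y values are plain offsets in the same units as the reference coordinates; the function name to_absolute and the example values are hypothetical.

```python
# Reference coordinates quoted in the description, keyed by city.
REFERENCE_ORIGINS = {
    "grenoble": (912000.0, 6456000.0),
    "paris": (649000.0, 6862000.0),
}


def to_absolute(city, rel_x, rel_y):
    """Shift a relative source position by the city's reference coordinates.

    Assumption: relative coordinates are plain offsets, in the same units as
    the reference coordinates, to be added to them.
    """
    origin_x, origin_y = REFERENCE_ORIGINS[city.lower()]
    return origin_x + rel_x, origin_y + rel_y


# Hypothetical relative position (250, 400), for illustration only.
print(to_absolute("Paris", 250.0, 400.0))
```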

⚠️ Note for Grenoble:

  • An additional file, PointsRejets_rel_new.csv, is provided and should be used only for simulations where the wind direction DXXX is such that DXXX / 10 is odd (i.e., DXXX ∈ {10, 30, ..., 270}).

  • While the emission source locations remain unchanged, the indices in PointsRejets_rel_new.csv differ from those in PointsRejets_rel.csv.
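
The choice between the two index files can be sketched as a small Python helper. It applies the parity rule exactly as worded in the note (DXXX / 10 is odd) to every direction passed in, which is an assumption; the function name source_index_file is hypothetical, and the result should be checked against the set of directions enumerated in the note.

```python
def source_index_file(city, wind_direction_deg):
    """Pick the source-coordinate CSV for a city and wind direction.

    Follows the parity rule in the note: for Grenoble, directions whose
    value divided by 10 is odd use PointsRejets_rel_new.csv.
    """
    if wind_direction_deg % 10 != 0:
        raise ValueError("wind direction must be a multiple of 10 degrees")
    tens = wind_direction_deg // 10
    if city.lower() == "grenoble" and tens % 2 == 1:
        return "PointsRejets_rel_new.csv"
    return "PointsRejets_rel.csv"


print(source_index_file("Grenoble", 30))
print(source_index_file("Grenoble", 40))
print(source_index_file("Paris", 30))
```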

🔹 Urban building maps (indice_building_grenoble.nc / indice_building_paris.nc)

NetCDF files encoding the 3D urban environment as a matrix:

  • A value of 0 indicates no obstacle

  • A positive value z represents the height of an obstacle
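
To illustrate this encoding, the Python sketch below derives an obstacle mask and a built-up fraction from a toy height matrix. It assumes the matrix is a 2D grid of heights; the toy values and function names are hypothetical, and reading the actual NetCDF files (for example with a NetCDF library) is left out.

```python
def obstacle_mask(height_map):
    """Return a same-shaped grid of booleans: True where an obstacle stands."""
    return [[z > 0 for z in row] for row in height_map]


def built_fraction(height_map):
    """Fraction of horizontal cells covered by an obstacle."""
    cells = [z for row in height_map for z in row]
    return sum(1 for z in cells if z > 0) / len(cells)


# Hypothetical 3 x 4 height map, for illustration only.
toy_map = [
    [0, 0, 12, 12],
    [0, 0, 12, 0],
    [8, 0, 0, 0],
]

print(obstacle_mask(toy_map)[0])
print(built_fraction(toy_map))
```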

🖥️ Extraction Instructions

The archives are split for upload convenience. You must recombine and extract them before using the dataset.

🐧 On Linux/macOS:


cat raw_paris.tar.gz.part_* > raw_paris.tar.gz
tar -xvzf raw_paris.tar.gz

cat raw_grenoble.tar.gz.part_* > raw_grenoble.tar.gz
tar -xvzf raw_grenoble.tar.gz

 

🪟 On Windows (PowerShell):


# Windows PowerShell 5.1 syntax. In PowerShell 6 or later, use -AsByteStream instead of -Encoding Byte.
Get-Content raw_paris.tar.gz.part_* -Encoding Byte -ReadCount 0 | Set-Content -Encoding Byte raw_paris.tar.gz
tar -xvzf raw_paris.tar.gz

Get-Content raw_grenoble.tar.gz.part_* -Encoding Byte -ReadCount 0 | Set-Content -Encoding Byte raw_grenoble.tar.gz
tar -xvzf raw_grenoble.tar.gz
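
As a platform-independent alternative to the shell commands above, the parts can also be checked and recombined with the Python standard library. This is a minimal sketch under stated assumptions: the helper names md5_of_file and join_parts are hypothetical, and the parts are assumed to sit in the current directory and to sort correctly by name (part_aa, part_ab, ...). Each part's digest can be compared with the MD5 checksums listed under Files.

```python
import glob
import hashlib
import shutil

CHUNK_SIZE = 1024 * 1024  # read 1 MiB at a time to keep memory use low


def md5_of_file(path):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(CHUNK_SIZE), b""):
            digest.update(chunk)
    return digest.hexdigest()


def join_parts(prefix, output_path):
    """Concatenate all files named prefix + '*', in sorted order, into output_path."""
    parts = sorted(glob.glob(glob.escape(prefix) + "*"))
    if not parts:
        raise FileNotFoundError(f"no files match {prefix}*")
    with open(output_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out, CHUNK_SIZE)
    return parts


# Example usage (commented out because the parts must be downloaded first):
# print(md5_of_file("raw_paris.tar.gz.part_aa"))
# join_parts("raw_paris.tar.gz.part_", "raw_paris.tar.gz")
```

Both helpers stream the data in fixed-size chunks, so memory use stays small even for the 10.5 GB parts.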

 

🧠 Technical Details

The simulations were conducted using high-performance computing (HPC) resources with the PMSS modeling system, developed by CEA and ARIA Technologies. The model computes:

  • Steady-state 3D wind fields

  • Transient 3D pollutant dispersion fields

The computational domain is defined with a high-resolution mesh:

  • 2 m horizontal and vertical resolution up to building height, over a ~1 km² surface area

  • Progressively coarsening vertical resolution above rooftops, while maintaining 2 m horizontal resolution
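
For a sense of scale, the horizontal grid size implied by these figures can be estimated with a few lines of Python. The sketch assumes the ~1 km² footprint is a 1000 m x 1000 m square, which the description does not state; the actual grid dimensions should be read from the NetCDF files.

```python
# Assumption: the ~1 km² footprint is a 1000 m x 1000 m square.
domain_side_m = 1000.0
horizontal_resolution_m = 2.0

# Number of 2 m cells along one side, and per horizontal level.
cells_per_side = int(domain_side_m / horizontal_resolution_m)
cells_per_level = cells_per_side ** 2

print(cells_per_side)   # 500
print(cells_per_level)  # 250000
```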

 

📚 Applications

This dataset is suitable for:

  • Training and benchmarking scientific machine learning models

  • Studying urban pollutant dispersion

  • Evaluating model accuracy in atmospheric physics and air quality modeling

Files (164.5 GB)

MD5 checksum and size of each file:

  • md5:d044e90b5f22f34b86bf6ead580dd7bd (10.5 GB)
  • md5:86c9683243750c3593e6d1634a3010b9 (10.5 GB)
  • md5:bac7c7a72979fd4edbbc9d01c7cb8249 (10.5 GB)
  • md5:f80705ac3b0390ff840d8c8d4293cae4 (10.5 GB)
  • md5:92c4a68d0064ef810f3bd296cf33e8b9 (10.5 GB)
  • md5:eb2621b8e85644ce838720aca5c1faec (10.5 GB)
  • md5:550d2320a649bc8655878f7516f5ce0e (10.5 GB)
  • md5:e40a0d77aab89bbd72aefad4832721cb (10.5 GB)
  • md5:b5f794b6fddc0de58965bb6962fbfd09 (10.5 GB)
  • md5:2a379f3d97a3994b413216cfac2aa25c (10.5 GB)
  • md5:f005a941ab1f893d7dccf90c5aec9b9d (4.3 GB)
  • md5:aecf55aec206bb8ac707dfd49823ebd5 (10.5 GB)
  • md5:5792160e296b1de05a24516360a1ddc9 (10.5 GB)
  • md5:5cd47801a14fcff0a75e5398e9add07f (10.5 GB)
  • md5:669caf85b1356c24efe89ae058e087ce (10.5 GB)
  • md5:f4ed6dcc13c93483309fb7a08bcecaee (10.5 GB)
  • md5:79c68c5ea6c5c1bb7a126369a6dbbf3e (3.0 GB)