There is a newer version of the record available.

Published December 2025 | Version pm25-full_rel-2025-11-r0_s-1.0_mb-1.0-r0
Dataset Open

South Asia PM2.5 map: India | Data

  • 1. Centre for Research on Energy and Clean Air
  • 2. Stanford University

Description

High-resolution surface PM2.5 concentrations for India, built from satellite products, surface monitors, and machine learning pipelines. This dataset is produced by a reimplementation, built by the Centre for Research on Energy and Clean Air (CREA), of the method described in "Improved daily PM2.5 estimates in India reveal inequalities in recent enhancement of air quality" (Kawano et al., 2025). See southasia-pm2-5.energyandcleanair.org for more details.

You can download daily PM2.5 data from the original authors for 1 January 2005 to 30 September 2023 from https://zenodo.org/records/13694585.

Structure of the dataset

The dataset is provided in CF-1.8–compliant NetCDF format, containing daily surface PM2.5 concentrations (μg/m3) at 10 km resolution over India. It includes time, x, and y dimensions, with the primary variable pm25(time, y, x). The data uses the WGS 84 / India NSF Lambert Conformal Conic projection (EPSG: 7755) to define spatial coordinates.

Model accuracy

We provide the results of the spatial cross-validation in a CSV file alongside the data. The file has the headers r2 and rmse (in μg/m3). The cross-validation metrics and methodology for daily values are the same as those used for the final full model by Kawano et al. (2025).
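As a minimal sketch of reading the cross-validation file, the snippet below parses a CSV with the r2 and rmse headers described above using only the Python standard library. The function name read_cv_metrics and the sample values are hypothetical and for illustration only; they are not the published results.

```python
import csv
import io


def read_cv_metrics(stream):
    """Parse rows with `r2` and `rmse` columns into dicts of floats."""
    reader = csv.DictReader(stream)
    return [
        {"r2": float(row["r2"]), "rmse": float(row["rmse"])}
        for row in reader
    ]


# Hypothetical sample content, NOT the values shipped with the dataset.
sample = io.StringIO("r2,rmse\n0.5,10.0\n")
metrics = read_cv_metrics(sample)
print(metrics)
```

For the downloaded file, the same function can be given an open file handle, for example open(path, newline="").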

Using the data

When using this data, please reference:

  • Kawano, Ayako, et al. "Improved daily PM2.5 estimates in India reveal inequalities in recent enhancement of air quality." Science Advances 11.4 (2025): eadq1071. https://doi.org/10.1126/sciadv.adq1071

  • The version of the data used from Zenodo.

Technical info

Data versioning

For each release, we provide versioning info to ensure every file name encodes when it was released, which schema it follows, and which model generated it. Major versions indicate breaking or significant changes in the schema or model.

Filename pattern: pm25-<granule>_<rel>_<s>_<mb>.nc

<granule>: spatial–temporal packaging

The file’s spatial–temporal packaging.

Format: lowercase letters and hyphens only. The only option currently available is full.

<rel>: release

Release month of the file and regeneration version. Independent of s and mb. This does not describe the maximum date available in the dataset, which is usually delayed by up to 2 months.

Format: rel-YYYY-MM-rT:

  • YYYY: four-digit year.

  • MM: two-digit month.

  • T: zero-based monthly regeneration counter. Resets to 0 when YYYY-MM changes.

<s>: schema

The file’s schema.

Format: s-X.Y

  • X: major schema version. Breaking or significant change in file schema or variable layout.

  • Y: minor/patch schema version. Backward-compatible metadata or layout additions.

<mb>: model bundle

The model bundle used to generate the dataset. The features, ingestion, and model code at every stage share a single version number, so a change to any of them can change the model bundle version.

Format: mb-X.Y-rT

  • X: major model version. Changes to model features or model code that alter the feature set or modeling approach.

  • Y: minor model version. Backward-compatible code changes that keep the feature set identical.

  • T: zero-based retrain counter. Resets to 0 when X.Y changes.
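The naming scheme above can be checked mechanically. The following is a sketch of a parser for it, assuming the components appear exactly as described (rel-YYYY-MM-rT, s-X.Y, mb-X.Y-rT). The names FILENAME_RE and parse_filename are hypothetical, and accepting a .csv extension alongside .nc is an assumption based on the accompanying cross-validation file sharing the same stem.

```python
import re

# Pattern for pm25-<granule>_<rel>_<s>_<mb>.<ext>, following the formats above.
FILENAME_RE = re.compile(
    r"^pm25-(?P<granule>[a-z-]+)"
    r"_rel-(?P<year>\d{4})-(?P<month>\d{2})-r(?P<regen>\d+)"
    r"_s-(?P<s_major>\d+)\.(?P<s_minor>\d+)"
    r"_mb-(?P<mb_major>\d+)\.(?P<mb_minor>\d+)-r(?P<retrain>\d+)"
    r"\.(?P<ext>nc|csv)$"  # assumption: .csv uses the same stem as .nc
)


def parse_filename(name):
    """Split a release file name into its versioning components."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognised pm25 file name: {name!r}")
    g = match.groupdict()
    return {
        "granule": g["granule"],
        # (year, month, regeneration counter)
        "release": (int(g["year"]), int(g["month"]), int(g["regen"])),
        # (major, minor)
        "schema": (int(g["s_major"]), int(g["s_minor"])),
        # (major, minor, retrain counter)
        "model_bundle": (
            int(g["mb_major"]),
            int(g["mb_minor"]),
            int(g["retrain"]),
        ),
        "extension": g["ext"],
    }


info = parse_filename("pm25-full_rel-2025-11-r0_s-1.0_mb-1.0-r0.nc")
print(info)
```

Comparing the first element of the schema or model_bundle tuple between two files is one way to detect a major-version change before mixing data from different releases.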

Notes

Disclaimer: The designations employed and the presentation of the material on maps contained in this dataset do not imply the expression of any opinion whatsoever concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries.

Files

pm25-full_rel-2025-11-r0_s-1.0_mb-1.0-r0.csv

Files (291.7 MB)

Checksum and size of each file:

  • md5:3478b29e3691fc51d4702e4b8c1e61c2 (19 Bytes)
  • md5:261e604ae07c75b33b3a8963e8cf1821 (291.7 MB)
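A downloaded file can be checked against the MD5 checksums listed above. The sketch below uses only Python's standard hashlib module; the helper names md5_of_file and matches_listing are hypothetical, and the "md5:" prefix handling assumes the checksum is written as it appears in this listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large NetCDF files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, matches_listing(local_path, "md5:261e604ae07c75b33b3a8963e8cf1821") should return True for an intact copy of the file carrying that checksum, where local_path is wherever the download was saved.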

Additional details

Software

Repository URL
https://github.com/energyandcleanair/pm25ml
Programming language
Python
Development Status
Active

References

  • Kawano, A., Kelp, M., Qiu, M., Singh, K., Chaturvedi, E., Dahiya, S., … Burke, M. (2025). Improved daily PM2.5 estimates in India reveal inequalities in recent enhancement of air quality. Science Advances, 11(4), eadq1071. https://doi.org/10.1126/sciadv.adq1071
  • Centre for Research on Energy and Clean Air (CREA) (n.d.). pm25ml: PM2.5 Estimation for India [Computer software]. https://github.com/energyandcleanair/pm25ml (Licensed under the MIT license).
  • Central Pollution Control Board (CPCB) (n.d.). Ambient PM₂.₅ measurements from CPCB Continuous Air Quality Monitoring Stations (CAAQMS) [Data set]. CPCB, India. https://airquality.cpcb.gov.in/ccr Accessed and daily-aggregated via Centre for Research on Energy and Clean Air (CREA) API.
  • European Space Agency, Copernicus Sentinel-5P (n.d.). Sentinel-5P OFFL L3 CO: Offline Carbon Monoxide [Data set]. Google Earth Engine Data Catalog. Accessed from https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S5P_OFFL_L3_CO
  • European Space Agency, Copernicus Sentinel-5P (n.d.). Sentinel-5P OFFL L3 NO₂: Offline nitrogen dioxide [Data set]. Google Earth Engine Data Catalog. Accessed from https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S5P_OFFL_L3_NO2
  • Friedl, M. & Sulla-Menashe, D. (2022). MODIS/Terra + Aqua Land Cover Type Yearly L3 Global 500 m SIN Grid (MCD12Q1), Version 6.1 [Data set]. NASA Land Processes DAAC. https://doi.org/10.5067/MODIS/MCD12Q1.061 Accessed via Google Earth Engine as MODIS/061/MCD12Q1.
  • Global Modeling and Assimilation Office (GMAO). (2015). MERRA-2 inst3_3d_chm_Nv: 3d,3-Hourly, Instantaneous, Model-Level, Assimilation, Carbon Monoxide and Ozone Mixing Ratio (M2I3NVCHM) Version 5.12.4 [Data set]. NASA GES DISC. https://doi.org/10.5067/HO9OVZWF3KW2 Accessed via NASA GES DISC through Harmony Subsetter API.
  • Global Modeling and Assimilation Office (GMAO). (2015). MERRA-2 tavg1_2d_aer_Nx: 2d,1-Hourly, Time-averaged, Single-Level, Assimilation, Aerosol Diagnostics (M2T1NXAER), Version 5.12.4 [Data set]. NASA GES DISC. https://doi.org/10.5067/KLICLTZ8EM9D Accessed via NASA GES DISC through Harmony Subsetter API.
  • Kawano, A., Kelp, M., Qiu, M., Singh, K., Chaturvedi, E., Dahiya, S., Azevedo, I., & Burke, M. (2024). High-Quality Daily PM₂.₅ Datasets for India at 10 km Resolution (Version 2) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.13694585
  • Krotkov, N. A., Lamsal, L. N., Marchenko, S. V., Celarier, E. A., Bucsela, E. J., Swartz, W. H., Joiner, J., & the OMI core team (2019). OMI/Aura NO2 Cloud-Screened Total and Tropospheric Column L3 Global Gridded 0.25 degree x 0.25 degree (OMNO2d), Version 003 [Data set]. NASA GES DISC. https://doi.org/10.5067/Aura/OMI/DATA3007 Accessed via NASA GES DISC through Earthdata API.
  • Lyapustin, A. & Wang, Y. (2022). MODIS/Terra+Aqua Land Aerosol Optical Depth Daily L2G Global 1km SIN Grid (MCD19A2), Version 6.1 [Data set]. NASA Land Processes DAAC. https://doi.org/10.5067/MODIS/MCD19A2.061 Accessed via Google Earth Engine as MODIS/061/MCD19A2_GRANULES.
  • Muñoz Sabater, J. (2019). ERA5-Land hourly data from 1950 to present [Data set]. Copernicus Climate Change Service (C3S) Climate Data Store (CDS). https://doi.org/10.24381/cds.e2161bac Accessed via Google Earth Engine as ECMWF/ERA5_LAND/DAILY_AGGR.
  • NASA Jet Propulsion Laboratory. (2013). NASA Shuttle Radar Topography Mission (SRTM) Global 1 arc second, Version 3.0 (SRTMGL1 v003) [Data set]. NASA Land Processes DAAC. https://doi.org/10.5067/MEASURES/SRTM/SRTMGL1.003 Accessed via Google Earth Engine as USGS/SRTMGL1_003.