South Asia PM2.5 map: India | Data
Authors/Creators
Description
High-resolution surface PM2.5 concentrations for India, built from satellite products, surface monitors, and machine learning pipelines. This dataset is produced using a reimplementation, built by the Centre for Research on Energy and Clean Air (CREA), of the method described in "Improved daily PM2.5 estimates in India reveal inequalities in recent enhancement of air quality" (Kawano et al., 2025). See southasia-pm2-5.energyandcleanair.org for more details.
Daily PM2.5 data from the original authors, covering 1 January 2005 to 30 September 2023, can be downloaded from https://zenodo.org/records/13694585.
Structure of the dataset
The dataset is provided in CF-1.8–compliant NetCDF format, containing daily surface PM2.5 concentrations (μg/m3) at 10 km resolution over India. It includes time, x, and y dimensions, with the primary variable pm25(time, y, x). The data uses the WGS 84 / India NSF Lambert Conformal Conic projection (EPSG: 7755) to define spatial coordinates.
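As a rough illustration of the layout described above, the following sketch uses xarray (an assumption; any NetCDF reader would work) to open a file and average pm25 over the grid for each day. The helper names open_pm25 and daily_spatial_mean are hypothetical, and the x and y coordinates are assumed to be projected coordinates in EPSG:7755 rather than longitude and latitude.

```python
import numpy as np
import xarray as xr


def open_pm25(path):
    """Open a release file and return the pm25 variable with dims (time, y, x)."""
    # Requires a NetCDF backend such as netCDF4 or h5netcdf to be installed.
    dataset = xr.open_dataset(path)
    return dataset["pm25"]


def daily_spatial_mean(pm25):
    """Average over y and x for each day, skipping cells that hold NaN."""
    return pm25.mean(dim=("y", "x"), skipna=True)


# Tiny synthetic array with the documented dimension order, for illustration only.
# The values and coordinates are placeholders, not real concentrations.
example = xr.DataArray(
    np.array([[[1.0, 3.0], [np.nan, 5.0]], [[2.0, 2.0], [2.0, 2.0]]]),
    dims=("time", "y", "x"),
    coords={
        "time": np.array(["2023-01-01", "2023-01-02"], dtype="datetime64[ns]"),
        "y": [0.0, 10000.0],
        "x": [0.0, 10000.0],
    },
    name="pm25",
)
print(daily_spatial_mean(example).values)
```

With a downloaded file, open_pm25("<path to the .nc file>") would replace the synthetic array. Because the full file is several hundred megabytes, passing chunks to xr.open_dataset (which needs dask) may be preferable on machines with limited memory.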
Model accuracy
We provide the results of the spatial cross-validation in a CSV file alongside the dataset. The file has the headers r2 and rmse (in μg/m3). We use the same cross-validation metrics and methodology for daily values as Kawano et al. (2025) used for their final full model.
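A minimal sketch of reading the metrics file with Python's standard csv module, assuming only the two documented headers r2 and rmse. The function name read_cv_metrics is hypothetical, and the values in the sample string are placeholders, not results from this dataset.

```python
import csv
import io


def read_cv_metrics(handle):
    """Parse cross-validation metrics from an open text file with r2 and rmse columns."""
    reader = csv.DictReader(handle)
    # Convert each row to floats; rmse is in the same units as the data.
    return [{"r2": float(row["r2"]), "rmse": float(row["rmse"])} for row in reader]


# Placeholder content standing in for the real CSV file.
sample = io.StringIO("r2,rmse\n0.5,10.0\n")
for metrics in read_cv_metrics(sample):
    print(metrics["r2"], metrics["rmse"])
```

For the real file, open it with open(path, newline="") and pass the handle to the same function.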
Using the data
When using this data, please reference:
- Kawano, Ayako, et al. "Improved daily PM2.5 estimates in India reveal inequalities in recent enhancement of air quality." Science Advances 11.4 (2025): eadq1071. https://doi.org/10.1126/sciadv.adq1071
- The version of the data used from Zenodo.
Technical info
Data versioning
For each release, we provide versioning info to ensure every file name encodes when it was released, which schema it follows, and which model generated it. Major versions indicate breaking or significant changes in the schema or model.
Filename pattern: pm25-<granule>_<rel>_<s>_<mb>.nc
<granule>: spatial–temporal packaging
The file’s spatial–temporal packaging.
Format: lowercase letters and hyphens only. The only option currently available is full.
<rel>: release
Release month of the file and regeneration version. Independent of s and mb. This does not describe the maximum date available in the dataset, which is usually delayed by up to 2 months.
Format: rel-YYYY-MM-rT
- YYYY: four-digit year.
- MM: two-digit month.
- T: zero-based monthly regeneration counter. Resets to 0 when YYYY-MM changes.
<s>: schema
The file’s schema.
Format: s-X.Y
- X: major schema version. Breaking or significant change in file schema or variable layout.
- Y: minor/patch schema version. Backward-compatible metadata or layout additions.
<mb>: model bundle
The model bundle used to generate the dataset. The models at every stage are versioned using the same number and changes to any of these can change the model bundle version. We version the features, ingestion, and model code under this version.
Format: mb-X.Y-rT
- X: major model version. Changes to model features or model code that alter the feature set or modeling approach.
- Y: minor model version. Backward-compatible code changes that keep the feature set identical.
- T: zero-based retrain counter. Resets to 0 when X.Y changes.
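The naming scheme above can be checked mechanically. The sketch below is one possible parser using a regular expression. The function name parse_release_filename and the returned dictionary layout are hypothetical, and accepting a .csv extension in addition to .nc is an assumption based on the file listing.

```python
import re

# Mirrors the documented pattern pm25-<granule>_<rel>_<s>_<mb>.nc.
# Allowing ".csv" as well is an assumption, not part of the stated pattern.
FILENAME_RE = re.compile(
    r"^pm25-(?P<granule>[a-z-]+)"
    r"_rel-(?P<rel_year>\d{4})-(?P<rel_month>0[1-9]|1[0-2])-r(?P<rel_regen>\d+)"
    r"_s-(?P<schema_major>\d+)\.(?P<schema_minor>\d+)"
    r"_mb-(?P<model_major>\d+)\.(?P<model_minor>\d+)-r(?P<model_retrain>\d+)"
    r"\.(?P<ext>nc|csv)$"
)


def parse_release_filename(name):
    """Split a release file name into its granule, release, schema, and model bundle."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognised release file name: {name!r}")
    parts = match.groupdict()
    return {
        "granule": parts["granule"],
        # (year, month, regeneration counter)
        "release": (int(parts["rel_year"]), int(parts["rel_month"]), int(parts["rel_regen"])),
        # (major, minor)
        "schema": (int(parts["schema_major"]), int(parts["schema_minor"])),
        # (major, minor, retrain counter)
        "model_bundle": (
            int(parts["model_major"]),
            int(parts["model_minor"]),
            int(parts["model_retrain"]),
        ),
        "extension": parts["ext"],
    }


info = parse_release_filename("pm25-full_rel-2025-11-r0_s-1.0_mb-1.0-r0.nc")
print(info["release"])       # (2025, 11, 0)
print(info["model_bundle"])  # (1, 0, 0)
```

Returning tuples of integers means two versions can be compared directly, for example to check whether a new file has a different major schema version before reusing existing loading code.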
Files
pm25-full_rel-2025-11-r0_s-1.0_mb-1.0-r0.csv
Total size: 291.7 MB
| Checksum | Size |
|---|---|
| md5:3478b29e3691fc51d4702e4b8c1e61c2 | 19 Bytes |
| md5:261e604ae07c75b33b3a8963e8cf1821 | 291.7 MB |
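After downloading, a file can be compared against its listed MD5 checksum. The following is a minimal sketch using Python's standard hashlib module. The function names md5_of_file and matches_checksum are hypothetical, and the file is read in chunks so that the larger file does not have to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file with a listed checksum such as 'md5:3478b2...'."""
    # Drop the optional 'md5:' prefix used in the file listing.
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

A call such as matches_checksum("<downloaded file>", "md5:261e604ae07c75b33b3a8963e8cf1821") would return True only if the download is byte-for-byte identical to the published file.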
Additional details
Software
- Repository URL: https://github.com/energyandcleanair/pm25ml
- Programming language: Python
- Development status: Active
References
- Kawano, A., Kelp, M., Qiu, M., Singh, K., Chaturvedi, E., Dahiya, S., … Burke, M. (2025). Improved daily PM2.5 estimates in India reveal inequalities in recent enhancement of air quality. Science Advances, 11(4), eadq1071. https://doi.org/10.1126/sciadv.adq1071
- Centre for Research on Energy and Clean Air (CREA) (n.d.). pm25ml: PM2.5 Estimation for India [Computer software]. https://github.com/energyandcleanair/pm25ml (Licensed under the MIT license).
- Central Pollution Control Board (CPCB) (n.d.). Ambient PM₂.₅ measurements from CPCB Continuous Air Quality Monitoring Stations (CAAQMS) [Data set]. CPCB, India. https://airquality.cpcb.gov.in/ccr Accessed and daily-aggregated via Centre for Research on Energy and Clean Air (CREA) API.
- European Space Agency, Copernicus Sentinel-5P (n.d.). Sentinel-5P OFFL L3 CO: Offline Carbon Monoxide [Data set]. Google Earth Engine Data Catalog. Accessed from https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S5P_OFFL_L3_CO
- European Space Agency, Copernicus Sentinel-5P (n.d.). Sentinel-5P OFFL L3 NO₂: Offline nitrogen dioxide [Data set]. Google Earth Engine Data Catalog. Accessed from https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S5P_OFFL_L3_NO2
- Friedl, M. & Sulla-Menashe, D. (2022). MODIS/Terra + Aqua Land Cover Type Yearly L3 Global 500 m SIN Grid (MCD12Q1), Version 6.1 [Data set]. NASA Land Processes DAAC. https://doi.org/10.5067/MODIS/MCD12Q1.061 Accessed via Google Earth Engine as MODIS/061/MCD12Q1.
- Global Modeling and Assimilation Office (GMAO). (2015). MERRA-2 inst3_3d_chm_Nv: 3d,3-Hourly, Instantaneous, Model-Level, Assimilation, Carbon Monoxide and Ozone Mixing Ratio (M2I3NVCHM) Version 5.12.4 [Data set]. NASA GES DISC. https://doi.org/10.5067/HO9OVZWF3KW2 Accessed via NASA GES DISC through Harmony Subsetter API.
- Global Modeling and Assimilation Office (GMAO). (2015). MERRA-2 tavg1_2d_aer_Nx: 2d,1-Hourly, Time-averaged, Single-Level, Assimilation, Aerosol Diagnostics (M2T1NXAER), Version 5.12.4 [Data set]. NASA GES DISC. https://doi.org/10.5067/KLICLTZ8EM9D Accessed via NASA GES DISC through Harmony Subsetter API.
- Kawano, A., Kelp, M., Qiu, M., Singh, K., Chaturvedi, E., Dahiya, S., Azevedo, I., & Burke, M. (2024). High-Quality Daily PM₂.₅ Datasets for India at 10 km Resolution (Version 2) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.13694585
- Krotkov, N. A., Lamsal, L. N., Marchenko, S. V., Celarier, E. A., Bucsela, E. J., Swartz, W. H., Joiner, J., & the OMI core team (2019). OMI/Aura NO2 Cloud-Screened Total and Tropospheric Column L3 Global Gridded 0.25 degree x 0.25 degree (OMNO2d), Version 003. [Data set]. NASA GES DISC. https://doi.org/10.5067/Aura/OMI/DATA3007 Accessed via NASA GES DISC through Earthdata API.
- Lyapustin, A. & Wang, Y. (2022). MODIS/Terra+Aqua Land Aerosol Optical Depth Daily L2G Global 1km SIN Grid (MCD19A2), Version 6.1 [Data set]. NASA Land Processes DAAC. https://doi.org/10.5067/MODIS/MCD19A2.061 Accessed via Google Earth Engine as MODIS/061/MCD19A2_GRANULES.
- Muñoz Sabater, J. (2019). ERA5-Land hourly data from 1950 to present [Data set]. Copernicus Climate Change Service (C3S) Climate Data Store (CDS). https://doi.org/10.24381/cds.e2161bac Accessed via Google Earth Engine as ECMWF/ERA5_LAND/DAILY_AGGR.
- NASA Jet Propulsion Laboratory. (2013). NASA Shuttle Radar Topography Mission (SRTM) Global 1 arc second, Version 3.0 (SRTMGL1 v003) [Data set]. NASA Land Processes DAAC. https://doi.org/10.5067/MEASURES/SRTM/SRTMGL1.003 Accessed via Google Earth Engine as USGS/SRTMGL1_003.