Global Facility-Level Solar Photovoltaic Inventory with Energy Generation and Loss Estimates
Description
Overview
This Zenodo release provides a global facility-level solar photovoltaic (PV) inventory with facility-scale energy generation and aerosol-related loss estimates. It was prepared alongside the manuscript "Coal plants persist as a large barrier to the global solar energy transition" (Nature Sustainability).
The dataset was generated using the framework described in the manuscript and Methods. In brief, a three-step workflow was used: (1) identify candidate PV facilities globally by combining existing inventories, crowd-sourced records, and a CNN-based scan of Sentinel-2 imagery; (2) extract precise panel footprints from confirmed sites using SAM-based segmentation; and (3) integrate the resulting footprints with MERRA-2 atmospheric reanalysis and a validated PV model to estimate facility-level generation and losses from clouds and aerosols.
The release includes PV facility footprints and the core attributes `PV_ID`, `latitude`, `longitude`, `country`, `year`, and `area_m2`.
In the main inventory files, `year` is the PV facility build/commissioning year (installation year), estimated from Sentinel-2 time-series classification as described in the manuscript Methods.
The release contains two complementary data components:
- A global geospatial PV facility inventory (`.gpkg`, `.csv`, `.parquet`).
- Annual facility-level PV generation/loss tables (`PV_facility_generation_year_YYYY.csv`, currently 2017-2023).
Package Contents
- `global_pv_facility_inventory.gpkg`
  - Layer: `global_pv_facility_inventory`
  - Geometry: `MultiPolygon`
  - CRS: `EPSG:4326` (WGS 84)
- `global_pv_facility_inventory.csv`
  - Attribute-only table (no geometry)
- `global_pv_facility_inventory.parquet`
  - GeoParquet (geometry + attributes)
- Year-specific facility-level generation/loss tables (top-level CSV files)
  - Generated to support the manuscript's analysis of facility-level PV energy generation and losses.
  - Each year-specific file includes only facilities installed by that year, so facility counts differ across years.
  - Files:
    - `PV_facility_generation_year_2017.csv`
    - `PV_facility_generation_year_2018.csv`
    - `PV_facility_generation_year_2019.csv`
    - `PV_facility_generation_year_2020.csv`
    - `PV_facility_generation_year_2021.csv`
    - `PV_facility_generation_year_2022.csv`
    - `PV_facility_generation_year_2023.csv`
  - Each file includes:
    - Core facility columns: `PV_ID`, `latitude`, `longitude`, `country`, `year`, `area_m2`
    - `power_POA (kWh)`: energy generation estimated from plane-of-array (POA) irradiance under all-sky conditions.
    - `power_POA_clr (kWh)`: POA-based energy generation under clear-sky (cloud-free) conditions.
    - `power_POA_cln (kWh)`: POA-based energy generation under clean-sky (aerosol-free) conditions.
    - `aerosol_loss (kWh)`: facility-level aerosol-related energy loss, computed as `power_POA (kWh) - power_POA_cln (kWh)`.
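The aerosol-loss definition above can be sketched in a few lines of Python. This is a minimal illustration, not the manuscript's code: it assumes the CSV headers are literally `power_POA (kWh)` and `power_POA_cln (kWh)`, uses two made-up rows, and the helper names (`aerosol_loss`, `relative_aerosol_loss`) are hypothetical. The relative loss is a derived quantity and not a column in the dataset.

```python
import csv
import io

# Two made-up rows for illustration only; real values come from
# PV_facility_generation_year_YYYY.csv.
SAMPLE = """PV_ID,power_POA (kWh),power_POA_cln (kWh)
A,900.0,1000.0
B,450.0,500.0
"""


def aerosol_loss(row):
    """Aerosol-related loss as defined in this record: all-sky minus clean-sky."""
    return float(row["power_POA (kWh)"]) - float(row["power_POA_cln (kWh)"])


def relative_aerosol_loss(row):
    """Loss as a fraction of clean-sky generation (derived, not a dataset column)."""
    clean = float(row["power_POA_cln (kWh)"])
    return aerosol_loss(row) / clean if clean else float("nan")


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
losses = {r["PV_ID"]: aerosol_loss(r) for r in rows}
fractions = {r["PV_ID"]: relative_aerosol_loss(r) for r in rows}
```

With the formula as stated, the value is negative whenever clean-sky generation exceeds all-sky generation, so it is worth checking the sign of the `aerosol_loss (kWh)` column before summing or plotting it.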
How to Use This Dataset (Technical)
- If you need geometry, use:
  - `global_pv_facility_inventory.gpkg` (GIS-friendly)
  - `global_pv_facility_inventory.parquet` (fast analytics with geometry)
- If you need tabular attributes only, use:
  - `global_pv_facility_inventory.csv`
- For energy generation/loss analysis, use:
  - `PV_facility_generation_year_YYYY.csv` (currently 2017-2023)
- Linkages:
  - `PV_ID` is the facility identifier across all files.
  - `year` supports year-specific filtering and aggregation.
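As a rough sketch of the linkages above, the snippet below joins an inventory table to a yearly generation table on `PV_ID` and selects facilities installed by a given year. It uses only the standard library and made-up rows; the helper names (`index_by_pv_id`, `join_on_pv_id`, `installed_by`) are hypothetical, and it assumes `PV_ID` is unique within each file and that `year` parses as a number.

```python
import csv
import io


def index_by_pv_id(fileobj):
    """Read a CSV and return {PV_ID: row_dict}."""
    return {row["PV_ID"]: row for row in csv.DictReader(fileobj)}


def join_on_pv_id(inventory, generation):
    """Inner join on PV_ID; inventory fields win if a column name clashes."""
    return [{**generation[k], **inventory[k]} for k in inventory if k in generation]


def installed_by(rows, year):
    """Keep facilities whose installation year is at or before `year`."""
    return [r for r in rows if int(float(r["year"])) <= year]


# Made-up rows for illustration only.
INVENTORY = """PV_ID,country,year,area_m2
A,X,2017,1200.5
B,Y,2019,800.0
C,X,2022,300.0
"""
GENERATION_2019 = """PV_ID,power_POA (kWh)
A,900.0
B,450.0
"""

inv = index_by_pv_id(io.StringIO(INVENTORY))
gen = index_by_pv_id(io.StringIO(GENERATION_2019))
joined = join_on_pv_id(inv, gen)
eligible = installed_by(inv.values(), 2019)
```

For the real files, the same functions can be called on file handles opened with `open(path, newline="")`, for example on `global_pv_facility_inventory.csv` and `PV_facility_generation_year_2019.csv`.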
How This Dataset Is Used in the Paper
- To map and quantify global facility-level PV deployment (location, footprint area, and installation year).
- To estimate facility-level PV generation from POA irradiance under:
  - all-sky conditions (`power_POA (kWh)`),
  - clear-sky conditions (`power_POA_clr (kWh)`),
  - clean-sky conditions (`power_POA_cln (kWh)`).
- To quantify aerosol-related generation loss at facility level (`aerosol_loss (kWh)`), then aggregate by geography/year for manuscript analysis.
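The aggregation step mentioned above might look like the following sketch, which sums a loss column by country for one yearly table. It is an illustration under assumptions, not the manuscript's workflow: the function name `total_by_country` is hypothetical, the rows are made up, and the header is assumed to be literally `aerosol_loss (kWh)`.

```python
import csv
import io
from collections import defaultdict


def total_by_country(fileobj, column="aerosol_loss (kWh)"):
    """Sum one numeric column per country for a single yearly table."""
    totals = defaultdict(float)
    for row in csv.DictReader(fileobj):
        totals[row["country"]] += float(row[column])
    return dict(totals)


# Made-up rows for illustration only.
SAMPLE = """PV_ID,country,aerosol_loss (kWh)
A,X,-100.0
B,Y,-50.0
C,X,-25.0
"""

totals = total_by_country(io.StringIO(SAMPLE))
```

Running the same function over each `PV_facility_generation_year_YYYY.csv` file gives a country-by-year series. Because each yearly file only covers facilities installed by that year, changes between years reflect fleet growth as well as atmospheric conditions.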
Potential Reuse in Other Research
- National/regional assessments of aerosol impacts on PV generation.
- Benchmarking climate and air-quality penalties for existing PV fleets.
- Integration with grid, policy, or emissions datasets for energy-transition studies.
- Geospatial analyses linking PV siting patterns with environmental and socioeconomic variables.
Snapshot Statistics
- Facilities: 140,945
- Countries: 181
- Inventory years: 2017-2024
- Generation/loss tables: 2017-2023
- Latitude range: 41.61° S to 68.38° N
Contact
- Dr. Rui Song: rui.song@physics.ox.ac.uk or rui.song90@gmail.com
Files
Total size: 1.3 GB

| MD5 checksum | Size |
|---|---|
| `780052777692c776e7c29f7acf1b4680` | 10.7 MB |
| `eda08d47a5a46996892c188b6dce279c` | 815.9 MB |
| `987ba41d05961ec0334ef9ca6dd4513f` | 324.8 MB |
| `244a21d479860fc0a8c87d7f73f8e83b` | 9.3 MB |
| `8f18cf96f67703aad8b4ca83e41f97eb` | 11.1 MB |
| `6c6045f24e2daa7fc6a268289fc749d9` | 12.7 MB |
| `67f92af5de1561e8a996d92cad4da212` | 14.3 MB |
| `ac1f1f790292653824dded69eb3589b1` | 15.8 MB |
| `4451bf9b5764a98b133d7e7511e1ef44` | 17.0 MB |
| `4d2e227f9507369f42cc9de18e7d0ee6` | 19.1 MB |
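A downloaded file can be checked against the MD5 checksums listed above with a short script like the one below. This is a minimal sketch using Python's standard `hashlib`; the function names (`md5_of_file`, `matches_checksum`) are hypothetical, and the file is read in chunks so that the larger files do not need to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 with an expected value, with or without an 'md5:' prefix."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

To use it, pass the local path of a downloaded file and the checksum shown for that file on the record page, for example `matches_checksum(local_path, "md5:...")`. A `False` result usually indicates an incomplete or corrupted download.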