GM-SEUS 2025: A harmonized dataset of ground-mounted solar energy in the US with enhanced metadata
Authors/Creators
Description
Ground-Mounted Solar Energy in the United States (GM-SEUS v2.0) 2025
Abstract
Solar energy generating systems are critical components of our expanding energy infrastructure, yet available datasets remain incomplete or are not publicly available, particularly at the sub-array level. Combining the best open-access datasets in the US with image analysis on freely available remotely-sensed imagery, we present the Ground-Mounted Solar Energy in the United States (GM-SEUS) dataset, a harmonized, open-access geospatial and temporal repository of solar energy arrays and panel-rows.
GM-SEUS v2.0 includes 18,980 commercial- and utility-scale ground-mounted solar photovoltaic and concentrating solar energy arrays (292 GW), spanning 3,817 km². Of these, 12,739 arrays (112 GW) include detailed panel-row geometries, comprising 3.43 million unique solar panel rows across 527 km². When restricted to arrays verifiable through high-resolution satellite and aerial imagery (those from hand-delineated spatial data sources and/or containing panel-row information, and therefore with low commission error), the dataset contains 15,744 arrays, representing 204 GW and 2,586 km².
We use these newly compiled and delineated solar arrays and panel-rows to harmonize and independently estimate attributes that add value to existing datasets, including installation year, azimuth, mount technology, panel-row area and dimensions, inter-row spacing, ground cover ratio, tilt, and installed capacity. By estimating and harmonizing these attributes of the distributed US solar energy landscape, GM-SEUS supports diverse applications in renewable energy modeling, ecosystem service assessment, and infrastructure planning.
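As an illustration of one of the attributes listed above, the following minimal Python sketch computes a ground cover ratio from panel-row areas and an array footprint area. It assumes one common area-based definition (total panel-row area divided by array area); this is not necessarily the definition used in GM-SEUS, and the function name, parameter names, and example values are hypothetical.

```python
from typing import Iterable


def ground_cover_ratio(panel_row_areas_m2: Iterable[float], array_area_m2: float) -> float:
    """Area-based ground cover ratio (assumed definition, see lead-in).

    Returns the total panel-row area divided by the array footprint area.
    """
    if array_area_m2 <= 0:
        raise ValueError("array_area_m2 must be positive")
    return sum(panel_row_areas_m2) / array_area_m2


# Hypothetical array: 40 panel rows of 250 m² each inside a 25,000 m² boundary.
example_gcr = ground_cover_ratio([250.0] * 40, 25_000.0)
print(f"GCR = {example_gcr:.2f}")
```

Because the array layers use a projected coordinate system, polygon areas taken from the geometries would already be in square meters, so a ratio like this needs no unit conversion.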
Technical info
This is the data repository for creating and maintaining the Ground-Mounted Solar Energy in the United States v2.0 (GM-SEUS 2025) spatiotemporal dataset of solar arrays and panel-rows using existing datasets, machine learning, and object-based image analysis to enhance existing sources. Contents of this repository are described here briefly, with the attached data README providing more detailed descriptions. The source GitHub repository for generating this dataset is https://github.com/stidjaco/GMSEUS. The related paper was published in Scientific Data.
This is the release of GM-SEUS (version 2.0). All input datasets and solar panel-row delineation results are up-to-date through November 7th, 2025.
Primary Repository Contents Include:
GMSEUS_Arrays_Final_2025_v2_0: Final array dataset with all array-level attributes, containing array boundaries from existing datasets, enhanced by the buffer-dissolve-erode technique applied to GM-SEUS panel-rows (EPSG:6350), geopackage, shapefile, and comma separated values
GMSEUS_Panels_Final_2025_v2_0: Final panel-row dataset with all panel-row-level attributes, containing boundaries from existing datasets and newly delineated GM-SEUS panel-rows (EPSG:6350), geopackage, shapefile, and comma separated values
GMSEUS_NAIP_Arrays_2025_v2_0: All array boundaries created by the buffer-dissolve-erode method from newly delineated (NAIP) GM-SEUS panel-rows (EPSG:6350), geopackage, shapefile, and comma separated values
GMSEUS_NAIP_Panels_2025_v2_0: All newly delineated panel-row boundaries (EPSG:6350), geopackage, shapefile, and comma separated values
GMSEUS_NAIP_PanelsNoQAQC_2025_v2_0: All newly delineated panel-rows from NAIP imagery without any quality control (EPSG:6350), geopackage, shapefile, and comma separated values
GMSEUS_RooftopArrays_2025_v2_0: Arrays from input source datasets reported or determined to be rooftop arrays by intersection with OpenBuildingMap building footprints. Non-exhaustive, intended as an exclusionary resource to maintain GM-SEUS ground-mounted status (EPSG:6350), geopackage, shapefile, and comma separated values
NAIPtrainRF: Training dataset of 12,000 NAIP training points (2,000 per class) containing class values, spectral index values, the year of NAIP imagery accessed, and point coordinates (WGS84), comma separated values
NAIPclassifyRF: Random forest classifier trees and weights as output from Google Earth Engine classifier, comma separated values
LabeledImages: Directory containing image and mask subdirectories with ~17,500 input and target images for deep learning pattern recognition applications, GeoTIFF
* NOTE: As of v2.0, NAIPtrainRF, NAIPclassifyRF, and LabeledImages have not been updated beyond v1.0.
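The buffer-dissolve-erode technique named for the array layers above can be illustrated with a simplified one-dimensional analogue. The sketch below is not the GM-SEUS implementation, which operates on two-dimensional panel-row polygons; it treats each panel row as an interval along one axis, and the function name, interval representation, and buffer distance are hypothetical. It shows the general idea: rows separated by gaps no wider than twice the buffer distance merge into one array footprint, and the final erosion restores the original outer extent.

```python
from typing import List, Tuple

Interval = Tuple[float, float]


def buffer_dissolve_erode(rows: List[Interval], distance: float) -> List[Interval]:
    """1D analogue of buffer-dissolve-erode (illustrative assumption only)."""
    # 1) Buffer: grow every row outward by `distance` on both sides.
    buffered = sorted((lo - distance, hi + distance) for lo, hi in rows)

    # 2) Dissolve: merge buffered rows that now touch or overlap.
    dissolved: List[List[float]] = []
    for lo, hi in buffered:
        if dissolved and lo <= dissolved[-1][1]:
            dissolved[-1][1] = max(dissolved[-1][1], hi)
        else:
            dissolved.append([lo, hi])

    # 3) Erode: shrink each merged footprint back by the same distance.
    return [(lo + distance, hi - distance) for lo, hi in dissolved]


# Hypothetical rows: two close together, one far away.
rows = [(0.0, 2.0), (3.0, 5.0), (20.0, 22.0)]
arrays = buffer_dissolve_erode(rows, distance=1.0)
print(arrays)  # [(0.0, 5.0), (20.0, 22.0)]
```

In two dimensions the same three steps correspond to a positive polygon buffer, a union of the buffered shapes, and a negative buffer of equal size.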
Disclaimer:
This dataset provides a broad characterization of solar array design practices. Any characterization of solar array design and management derived from remote sensing imagery should be considered with extreme scrutiny given the limitations of such approaches. While our work fills a critical data gap and compiles and enhances existing high-fidelity datasets, the design practices reported here are thus subject to uncertainty and should not be used to represent actual conditions at individual sites. No warranty is expressed or implied regarding accuracy, completeness or fitness for a specific purpose. We publish this dataset in open access, for the broader science community, policy makers, and stakeholders in addressing questions about the existing renewable energy landscape and do not consent to this data being used to target, identify, or make claims about individual arrays, properties, or entities. Any such use case is strictly prohibited.
GM-SEUS is released under CC-BY 4.0. However, components derived from third-party datasets retain the original license of those inputs. Some upstream datasets used in boundary generation contain non-commercial (NC) licensing terms. As a result, users intending to reuse GM-SEUS for commercial purposes must ensure compliance with the licensing conditions of those upstream sources. GM-SEUS does not incorporate metadata or attribute information from non-commercial datasets. However, certain geometry or inferred boundaries may constitute derivative works of those sources. To support transparency, GM-SEUS retains the original spatial data source in the Source attribute column, and full upstream licensing information is provided in the accompanying sourceDataLicenses.csv file.
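For users screening the data before commercial reuse, the following Python sketch shows one way the Source attribute column could be cross-referenced against sourceDataLicenses.csv. Only the Source column name and the license file name come from this description; the License and arrayID column names, the license-matching rule, and the sample rows are hypothetical and should be checked against the actual files. It is a screening aid, not a legal determination.

```python
import csv
import io
from typing import Dict, List


def is_noncommercial(license_name: str) -> bool:
    """Heuristic: treat a license as NC if it has an 'NC' token or says 'non-commercial'."""
    text = license_name.upper().replace("-", " ").replace("_", " ")
    return "NC" in text.split() or "NONCOMMERCIAL" in text.replace(" ", "")


def flag_nc_rows(arrays_csv: str, licenses_csv: str,
                 source_col: str = "Source",        # named in the dataset description
                 license_col: str = "License"       # hypothetical column name
                 ) -> List[Dict[str, str]]:
    """Return array rows whose upstream source carries a non-commercial license."""
    licenses = {row[source_col]: row[license_col]
                for row in csv.DictReader(io.StringIO(licenses_csv))}
    return [row for row in csv.DictReader(io.StringIO(arrays_csv))
            if is_noncommercial(licenses.get(row[source_col], ""))]


# Hypothetical stand-ins for sourceDataLicenses.csv and the array CSV export.
licenses_csv = "Source,License\nSourceA,CC-BY 4.0\nSourceB,CC-BY-NC 4.0\n"
arrays_csv = "arrayID,Source\n1,SourceA\n2,SourceB\n3,SourceB\n"

flagged = flag_nc_rows(arrays_csv, licenses_csv)
print([row["arrayID"] for row in flagged])  # ['2', '3']
```

Sources missing from the license table are not flagged by this sketch, so a stricter workflow might treat unknown sources as restricted until verified.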
Files
Documentation.zip
Additional details
Funding
- United States Department of Agriculture
- INFEWS/T1: Developing Pathways Toward Sustainable Irrigation across the United States Using Process-based Systems Models (SIRUS) 2018-67003-27406
- Michigan State University
- Climate Change Research Support Program: Building a Foundation for SCALE (Sustainable Communities, Agriculture, Landscapes, and Energy)
- Foundation for Food and Agriculture Research
- Using Solar Panels to Enhance Groundwater 23-000780
Dates
- Collected
- 2025-11-07: All source datasets and imagery are up to date.
Software
- Repository URL
- https://github.com/stidjaco/GMSEUS
- Programming language
- Python, JavaScript
- Development Status
- Active
References
- Stid, J.T., Kendall, A.D., Anctil, A., Rapp, J., Bingaman, J.C., Hyndman, D.W. A harmonized dataset of ground-mounted solar energy in the US with enhanced metadata. Sci Data 12, 1586 (2025). https://doi.org/10.1038/s41597-025-05862-4