Public Utility Data Liberation Project (PUDL) Data Release
Creators
- 1. Catalyst Cooperative
Description
PUDL v2025.5.0 (2025-05-20)
This is our regular quarterly PUDL data release for 2025Q2. It includes sub-annual updates to the EIA-860M, EIA-923, EIA-930, EIA bulk electricity API, and EPA CEMS datasets. It also includes preliminary 2024 data for FERC Form 1 (integrated into PUDL) and FERC Forms 2, 6, and 60 (as stand-alone SQLite databases). The VCE RARE hourly county-level renewable energy generation curves have been extended back to cover 2014-2018.
This release also includes new imputed versions of the FERC-714 and EIA-930 hourly demand curves with missing values filled in, and a better organized version of the SEC 10-K company ownership data. Note that work on the demand imputations and SEC 10-K data is ongoing.
All federal data was archived from the publishing agencies on May 1st, 2025.
Upcoming Deprecations
- Due to the growing size of the PUDL database, we are no longer updating our Datasette deployment, and that URL will soon begin redirecting users to the PUDL Data Viewer. You can track our progress toward feature parity with the old Datasette deployment in this issue.
- When we complete the migration of our data validation tests to the dbt framework, we will remove the deprecated pudl.output.pudltabl.PudlTabl output class. This will also happen before our next quarterly release.
New Data
FERC 714
- We refactored our time series imputation functions to be more general and reusable, so they can be applied to electricity demand curves from both FERC-714 and EIA-930, as well as other time series data in the future. This resulted in some minor changes to the imputation results. See issue #4112 and PR #4113.
- Added the table out_ferc714__hourly_planning_area_demand, which contains an imputed version of demand. Previously these imputed values were not distributed directly; they were only fed into the out_ferc714__hourly_estimated_state_demand table.
EIA 930
Work on producing EIA 930 demand curves suitable for use in electricity system modeling is being done in collaboration with @awongel at Carnegie Science, with support from GridLab. See issue #4083 for a list of related issues.
- Added the table out_eia930__hourly_subregion_demand, which contains an imputed version of subregion demand. See issues #4124, #4136 and PR #4149.
- Added the table out_eia930__hourly_operations, which contains an imputed version of BA-level demand. See issue #4138 and PR #4162.
SEC 10-K
- Reorganized the preliminary SEC 10-K data that was integrated into our last release. See issue #4078 and PR #4134. The SEC 10-K tables are now more fully normalized and better conform to existing PUDL naming conventions. Overall revision of the SEC 10-K data is being tracked in issue #4085.
Note that the SEC 10-K data is still a work in progress, and there are known issues that remain to be resolved in the upstream repository that generates this data.
The new tables include:
Expanded Data Coverage
FERC Form 1
- Integrated FERC Form 1 data from 2024 into the main PUDL SQLite DB. See issue #4207 and PR #4215. FERC Form 1 has a filing deadline of April 18th for utility respondents, but late filings may come throughout the year. This update includes ~95% of the expected utility responses for 2024.
FERC Forms 2, 6, & 60
- Updated the FERC archive DOIs and ferc_to_sqlite settings to extract 2024 XBRL data for FERC Forms 2, 6, and 60 and add them to their respective SQLite databases. Note that this data is not yet being processed beyond the conversion from XBRL to SQLite. See PR #4250. The reporting deadline for these forms was April 18th, 2025, so they should include the vast majority of the expected data; however, some late filings may be added in the next quarterly release.
EIA Bulk Electricity
- Updated the EIA Bulk Electricity data to include data published up through 2025-05-01. Also adapted the extractor to handle changes in formatting for the EIA Bulk API archive. See #4237 and PR #4246.
EPA CEMS
EIA 930
- Updated EIA 930 to include data published up through the beginning of May 2025. See #4235 and #4242. Raw data now includes adjusted and imputed values for the unknown fuel source, making it behave like other fuel sources; see Changes in energy source granularity over time for more information.
EIA 860M
EIA 923
VCE RARE
- Integrated 2014-2018 RARE data into PUDL. Also fixed misleading latitude and longitude field descriptions, and renamed the field county_or_lake_name to place_name. See issue #4226 and PR #4239.
Bug Fixes
- Fixed a bug in FERC XBRL extraction that led to quietly skipping tables with names that didn’t conform to the expected format. The only known affected table was in FERC Form 6. See issue #4203 and PRs #4224 and catalyst-cooperative/ferc-xbrl-extractor #320.
- As part of #4215 we fixed a bug introduced in the last release that was causing most values in the out_ferc1__yearly_rate_base table to be dropped. See this commit.
Quality of Life Improvements
- We now publish a Frictionless data package describing our Parquet outputs, with the name pudl_datapackage.json. See #4069 and #4070.
- We renamed eia_bulk_elec to eiaapi to conform to our dataset naming protocols and reflect the expansion of the EIA Bulk API archive to include all datasets published through the EIA API, not just the bulk electricity data. See this PUDL archiver issue and PR #4212.
- To improve human readability, we added utility_id_pudl and utility_name_ferc1 columns to a number of derived FERC 1 output tables including:
  See PR #4260.
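As a rough illustration of how a Frictionless data package such as pudl_datapackage.json could be inspected, the sketch below maps each resource name to its column names. It relies only on the standard Frictionless descriptor keys (`resources`, `name`, `schema`, `fields`). The embedded descriptor is a hypothetical, heavily trimmed stand-in, and its field names are placeholders rather than real PUDL columns.

```python
import json

# Hypothetical, trimmed stand-in for pudl_datapackage.json. The field names
# below are placeholders for illustration, not real PUDL column names.
SAMPLE_DESCRIPTOR = """
{
  "name": "pudl",
  "resources": [
    {
      "name": "out_eia930__hourly_operations",
      "schema": {"fields": [
        {"name": "placeholder_id", "type": "string"},
        {"name": "placeholder_value", "type": "number"}
      ]}
    },
    {
      "name": "out_ferc714__hourly_planning_area_demand",
      "schema": {"fields": [
        {"name": "placeholder_id", "type": "integer"}
      ]}
    }
  ]
}
"""


def fields_by_resource(descriptor):
    """Map each resource name to the list of its field (column) names."""
    result = {}
    for resource in descriptor.get("resources", []):
        fields = resource.get("schema", {}).get("fields", [])
        result[resource["name"]] = [field["name"] for field in fields]
    return result


if __name__ == "__main__":
    # With a real download, replace json.loads(...) with json.load(open(path)).
    for name, columns in fields_by_resource(json.loads(SAMPLE_DESCRIPTOR)).items():
        print(name, columns)
```

Pointing the same function at a downloaded copy of the real descriptor should give a quick overview of which tables and columns the release contains, provided the file follows the standard layout assumed here.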
New Tests
We’re in the process of migrating hundreds of data validation tests to use the dbt framework. We have converted at least the following classes of tests:
- check_column_correlation: a more generic replacement for the old test_fbp_ferc1_mmbtu_cost_correlation pytest. See #4094, #4117. You can find the implementation in the check_column_correlation.sql file.
- expect_includes_all_value_combinations_from: a more generic replacement for the old ensure_all_ppe_ids_are_in_assn pytest. See #4096, #9123. You can find the implementation in the expect_includes_all_value_combinations_from.sql file.
- expect_quantile_constraints: a more generic replacement for the old vs_bounds pytest. See #4106, #4090, and #4171. You can find the implementation in the expect_quantile_constraints.sql file.
- 19 tests which required special handling; see #4093, #4114, #4151.
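To give a feel for what quantile-bound and column-correlation checks do, here is a minimal Python sketch of the general idea. It is not the PUDL implementation, which lives in the SQL files named above and may differ in detail. The function names, the constraint keys (`quantile`, `min_value`, `max_value`), and the unweighted linear-interpolation quantile are all assumptions made for illustration.

```python
import math


def quantile(values, q):
    """Linear-interpolation quantile of `values` for 0 <= q <= 1."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    pos = q * (len(ordered) - 1)
    lower = math.floor(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])


def quantile_failures(values, constraints):
    """Return one message per violated constraint (empty list means pass).

    Each constraint is a dict with a `quantile` key and optional `min_value`
    and `max_value` keys. These key names are hypothetical.
    """
    failures = []
    for constraint in constraints:
        observed = quantile(values, constraint["quantile"])
        if "min_value" in constraint and observed < constraint["min_value"]:
            failures.append(f"q={constraint['quantile']}: {observed} below minimum")
        if "max_value" in constraint and observed > constraint["max_value"]:
            failures.append(f"q={constraint['quantile']}: {observed} above maximum")
    return failures


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_ok(xs, ys, min_correlation):
    """Pass when the two columns are at least `min_correlation` correlated."""
    return pearson(xs, ys) >= min_correlation
```

A data validation run would apply checks like these to a column, or a pair of columns, of an output table and fail whenever a message is returned or the correlation falls below the threshold.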
Other PUDL v2025.5.0 Resources
- PUDL v2025.5.0 Data Dictionary
- PUDL v2025.5.0 Documentation
- PUDL in the AWS Open Data Registry
- PUDL v2025.5.0 in a free, public AWS S3 bucket: s3://pudl.catalyst.coop/v2025.5.0/
- PUDL v2025.5.0 in a requester-pays GCS bucket: gs://pudl.catalyst.coop/v2025.5.0/
- Zenodo archive of the PUDL GitHub repo for this release
- PUDL v2025.5.0 release on GitHub
- PUDL v2025.5.0 package in the Python Package Index (PyPI)
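The bucket prefixes listed above can be combined with a table name to form a cloud storage URI. The sketch below assumes that each table is stored as `<table_name>.parquet` directly under the versioned prefix. That layout is an assumption not stated on this page, and `table_uri` is a hypothetical helper, so check the PUDL documentation before relying on it.

```python
# Bucket roots taken from the resource list above.
BUCKETS = {
    "s3": "s3://pudl.catalyst.coop",
    "gcs": "gs://pudl.catalyst.coop",
}


def table_uri(table_name, version="v2025.5.0", cloud="s3"):
    """Build a URI for one table's Parquet file.

    Assumes the (unverified) layout <bucket>/<version>/<table_name>.parquet.
    """
    if cloud not in BUCKETS:
        raise ValueError(f"unknown cloud {cloud!r}; expected one of {sorted(BUCKETS)}")
    return f"{BUCKETS[cloud]}/{version}/{table_name}.parquet"


if __name__ == "__main__":
    print(table_uri("out_eia930__hourly_operations"))
```

If that layout holds, a URI built this way could be passed to a Parquet reader with S3 support, for example `pandas.read_parquet(uri, storage_options={"anon": True})` with the `s3fs` package installed. Note that the GCS bucket is requester-pays and needs authenticated access.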
Contact Us
If you're using PUDL, we would love to hear from you! Even if it's just a note to let us know that you exist, and how you're using the software or data. Here's a bunch of different ways to get in touch:
- Follow us on GitHub
- Use the PUDL GitHub issue tracker to let us know about any bugs or data issues you encounter
- GitHub Discussions is where we provide user support.
- Watch our GitHub Project to see what we're working on.
- Email us at hello@catalyst.coop for private communications.
- On Mastodon: @CatalystCoop@mastodon.energy
- On BlueSky: @catalyst.coop
- On Twitter: @CatalystCoop
- Connect with us on LinkedIn
- Play with our data and notebooks on Kaggle
- Combine our data with ML models on HuggingFace
- Learn more about us on our website: https://catalyst.coop
- Subscribe to our announcements list for email updates.
Files
(15.4 GB)
MD5 checksum | Size |
---|---|
md5:a68a69ed9d44ef47ecad473be5d3e225 | 7.8 MB |
md5:f99b4a03cdfc70c45d6fb515a5b63cf6 | 506.7 MB |
md5:c56b23dc5814611af55f7343eead523f | 275.5 MB |
md5:6eff5d831fad9d74bfab4f680d89ec62 | 186.8 MB |
md5:109d6609d0efa03befc023867b282166 | 2.3 MB |
md5:02c69518e60b733979671e73d2e12f9f | 7.3 MB |
md5:e88c0fe72337894e5df915cc13daadc2 | 74.5 MB |
md5:bcee72e544dd2acceb314928e208bcd4 | 26.5 MB |
md5:8a4d9aea32e2c13efaca78c41fb4f1f8 | 2.6 MB |
md5:086bdfaeb6293ed05e3bc6748f0f9e6c | 7.2 MB |
md5:b8d402cbb73412d13aedd14019e2a77f | 2.9 MB |
md5:fc174a50124b99eebb6da63a2e155985 | 4.2 MB |
md5:a2e6cebd4687bda8aec22732bb4720bd | 964.8 kB |
md5:4667b8113e5e6d61af554a6304bf5ebc | 1.9 MB |
md5:bb966021419bed66846e38c337775c0e | 43.9 MB |
md5:49ea5ff5239b2ada1b2826db10ece0d1 | 21.8 MB |
md5:a8eb9ca16da061abd386643ffddefe17 | 1.4 MB |
md5:22324cdc278878cd884e4caf9971d903 | 3.0 MB |
md5:130400bc8ff787e32a74aaf4b608b0ff | 157.7 MB |
md5:e97862d389cf64e745dc5fdd510e6af0 | 85.4 kB |
md5:9fef935f9a970839319ada082d6c9672 | 192.4 kB |
md5:9fe81b09d8cab418efed8c6ffa854e31 | 3.1 GB |
md5:a5a47c1454e11d34d67a1efa7964ae57 | 11.0 GB |
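The MD5 checksums listed for the files can be used to confirm that a download arrived intact. The following is a minimal sketch using Python's standard `hashlib`. The helper names and the local file path in the usage comment are hypothetical, and the file is read in chunks so that multi-gigabyte archives do not need to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_checksum):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed_checksum.strip().lower().split(":", 1)[-1]
    return md5_of_file(path) == expected


# Example usage with a hypothetical local path:
# matches_listing("downloads/some_file.zip", "md5:a68a69ed9d44ef47ecad473be5d3e225")
```

A mismatch usually indicates a truncated or corrupted download, in which case fetching the file again is the simplest remedy.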