Published August 19, 2024 | Version v2024.8.0
Dataset Open

Public Utility Data Liberation Project (PUDL) Data Release

Description

PUDL v2024.8.0 Data Release

This is our regular quarterly release for 2024Q3. It updates every dataset that its publisher releases quarterly or more often, including EIA-860M, EIA-923 (YTD data), EIA-930, the EIA’s bulk electricity API data (used to fill in missing fuel prices), and the EPA CEMS hourly emissions data.

Annual datasets which have been published since our last quarterly release have also been integrated. These include FERC Forms 1, 2, 6, 60, and 714, and the NREL ATB.

This release also includes provisional versions of the annual 2023 EIA-860 and EIA-923 datasets, whose final release will not happen until the fall.

New Data Coverage

FERC Form 1

  • Integrated FERC Form 1 data from 2023 into the main PUDL SQLite DB. See issue #3700 and PR #3701. This required updating to a new version of the catalystcoop.ferc_xbrl_extractor package because there are now multiple XBRL taxonomies in use by FERC in different years, or even within the same year. See this PR for more details, as well as issue #3544 and PR #3710.

FERC Forms 2, 6, 60, & 714

  • Updated the ferc_to_sqlite settings to extract 2023 XBRL data for FERC Forms 2, 6, 60, and 714 and add them to their respective SQLite databases. Note that this data is not yet being processed beyond the conversion from XBRL to SQLite. See PR #3710.
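
Because these forms are only converted from XBRL to SQLite and not processed further, a quick way to see what a converted database contains is to list its tables. The following is a minimal sketch using Python's standard sqlite3 module. The function name list_tables and the in-memory demo table are hypothetical, used only so the example runs on its own; in practice you would connect to a downloaded SQLite file instead.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in a SQLite database, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


# Demo against an in-memory database holding a made-up table. For real data,
# use sqlite3.connect("<path to a FERC SQLite file>") instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_table (entity_id TEXT, report_year INTEGER)")
tables = list_tables(conn)
```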

EIA AEO

EIA 860

  • Added EIA 860 early release data from 2023. This included adding a new tab with proposed energy storage generators as well as adding a number of new columns regarding energy storage and solar generators. See issue #3676 and PR #3681.

  • Added EIA 860m data through June 2024. See issue #3759 and PR #3767.

EIA 923

  • Added EIA 923 early release data from 2023. See #3719 and PR #3721.

  • Added EIA 923 monthly data through May as part of the Q2 quarterly release. See #3760 and #3768.

EIA 930

  • Added EIA 930 hourly data through the end of July as part of the Q2 quarterly release. See #3761 and #3789.

EPA CEMS

  • Added 2024 Q2 of CEMS data. See #3762 and #3769.

EIA Bulk Electricity Data

  • Updated the EIA Bulk Electricity data archive to include data that was available as of 2024-08-01, which covers up through 2024-05-01 (3 months more than the previously used archive). See #3763 and PR #3785.

FERC 714

NREL ATB

  • Added 2024 NREL ATB data. This includes adding a new tax credit case, model_tax_credit_case_nrelatb, a breakout of capex_grid_connection_per_kw for all technologies, and more detailed nuclear breakdowns of fuel_cost_per_mwh. Simultaneously, updated the docs.dev.existing_data_updates documentation to make it easier to add future years of data. See #3706 and #3719.

  • Updated NREL ATB data to include error corrections in the 2024 data. See #3777 and PR #3778.

Data Cleaning

  • When generator_operating_date values are too inconsistent to be harvested successfully, we now take the last reported date in EIA 860 and 860M. See #423 and PR #3967.

  • Added the generator_operating_date field into core_eia860m__changelog_generators, adding 860M reported generator operating dates into the changelog table. This table is not harvested, and thus does not affect the generator_operating_date values reported in other core EIA tables. See #3722 and PR #3751.
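
The fallback described in the first item above can be illustrated with a minimal sketch. This is not PUDL's implementation: the record layout, the function name last_reported_operating_date, and the sample values are all assumptions made for illustration. For each generator, it keeps the operating date attached to the most recent report.

```python
from datetime import date


def last_reported_operating_date(records):
    """Map (plant_id, generator_id) to the operating date from the latest report.

    `records` is an iterable of
    (plant_id, generator_id, report_date, operating_date) tuples, where
    operating_date may be None if it was not reported.
    """
    latest = {}
    for plant_id, generator_id, report_date, operating_date in records:
        if operating_date is None:
            continue  # skip reports that carry no operating date
        key = (plant_id, generator_id)
        # Keep the value attached to the most recent report_date seen so far.
        if key not in latest or report_date > latest[key][0]:
            latest[key] = (report_date, operating_date)
    return {key: value[1] for key, value in latest.items()}


# Hypothetical sample: one generator with conflicting operating dates.
records = [
    (1, "GEN1", date(2021, 1, 1), date(2005, 6, 1)),
    (1, "GEN1", date(2022, 1, 1), date(2006, 1, 1)),
    (1, "GEN1", date(2024, 6, 1), date(2006, 3, 1)),
    (1, "GEN1", date(2023, 1, 1), None),
]
resolved = last_reported_operating_date(records)
```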

Bug Fixes

  • Disabled filling of missing values using rolling averages for the fuel_cost_per_mmbtu column in the out_eia923__fuel_receipts_costs table, as it was resulting in some anomalously high fuel prices. As a result, about 2% more records in the table remain NA after missing prices are filled with the average price for that fuel type, state, and month from the bulk EIA API data. See #3716.
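
As a rough illustration of the fill step that remains, the sketch below replaces missing fuel prices with an average looked up by fuel type, state, and month, and leaves a record empty when no average is available. It is a simplified stand-in, not PUDL's code: every key other than fuel_cost_per_mmbtu, the function name fill_fuel_cost, and all values are hypothetical.

```python
def fill_fuel_cost(receipts, average_prices):
    """Fill missing fuel_cost_per_mmbtu from (fuel_type, state, month) averages.

    Records with no matching average keep None; no rolling average is applied.
    """
    filled = []
    for row in receipts:
        row = dict(row)  # copy so the caller's records are not mutated
        if row["fuel_cost_per_mmbtu"] is None:
            key = (row["fuel_type"], row["state"], row["month"])
            row["fuel_cost_per_mmbtu"] = average_prices.get(key)
        filled.append(row)
    return filled


# Hypothetical averages and receipts, for illustration only.
average_prices = {("gas", "CO", "2024-01"): 3.10}
receipts = [
    {"fuel_type": "gas", "state": "CO", "month": "2024-01", "fuel_cost_per_mmbtu": None},
    {"fuel_type": "gas", "state": "CO", "month": "2024-01", "fuel_cost_per_mmbtu": 2.95},
    {"fuel_type": "coal", "state": "WY", "month": "2024-01", "fuel_cost_per_mmbtu": None},
]
filled = fill_fuel_cost(receipts, average_prices)
```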

Quality of Life Improvements

  • The full ETL settings are now read directly from etl_full.yml instead of using default values defined in the settings classes. The settings now also show up in the Dagster UI Launchpad. Previously they did not, which caused confusion when trying to re-run the FERC to SQLite conversions. See #3710.

  • mlflow experiment tracking has been disabled by default when running the DAG, since it is only really helpful during development of new record linkage or other ML workflows. See #3710.

Other PUDL v2024.8.0 Resources

Contact Us

If you're using PUDL, we would love to hear from you! Even if it's just a note to let us know that you exist, and how you're using the software or data. Here's a bunch of different ways to get in touch:

Files

Files (10.8 GB)

MD5 checksum                      Size
c12f90fd4b19c0c66a7f7e44be17fb76  6.6 MB
ad274d8371a45233a1379ba4bd71ef17  506.7 MB
1565a3c9c7c4229beb2557bdc3afeec7  275.5 MB
f07004b705cd6fe944b5d413c2644f1b  143.0 MB
5a058b78f0f382a89f807db19da0afd0  2.3 MB
e2ea5445b402a8c021ed9e4b806dd272  7.3 MB
9801528b27adc333bbec3ceb6af79217  74.5 MB
b7913f2adcbedce392ee70e62bc22330  20.0 MB
e85d0e508a05dff1caa290a39b789b13  2.6 MB
e7de3eb187c894a7cfb1dc6506d7d6da  7.2 MB
a6d548856007cec1155e98a3a7a72a7d  2.9 MB
b3c00916b489d645468110e410247926  3.3 MB
489283989c2d5d0a371f0d355d5302a5  964.8 kB
3761d6fc6a9e4901476793076d338ab4  1.9 MB
69226b5486820ad1962e9a1662a95260  43.9 MB
dd573c7fb77e143744eb9cb50dfdc5d7  16.1 MB
b9c50256598d7f375ecf4ad212b92cb2  1.4 MB
5dc4026532593b8f4dab3556345f2c1a  2.9 MB
3d463973f0afb15050fa7a49b4e1eaa2  157.7 MB
7854e7524aec07262b077a2e05393413  85.4 kB
9fef935f9a970839319ada082d6c9672  192.4 kB
a3eb84656f007596e933efbb19fe7171  2.5 GB
70286e70a2a1dfaa08187709b7c5de11  7.0 GB
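
The MD5 checksums listed above can be used to confirm that a download completed intact. The following is a minimal sketch using Python's standard hashlib module. The function name md5_of_file is hypothetical, and the path and checksum in the commented usage line are placeholders to be replaced with a real downloaded file and its checksum from the listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use low for multi-gigabyte files.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage (placeholder path and checksum):
# assert md5_of_file("<downloaded file>") == "<md5 from the file listing>"
```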