Published May 24, 2024 | Version v2024.5.0
Software Open

The Public Utility Data Liberation (PUDL) Project


What's Changed

New Data

Other Changes

  • Fix (more) v2024.02.03 release issues by @zaneselvans in #3346
  • Output Parquet files as well as SQLite in PUDL ETL by @zschira in #3296
  • Split monolithic ferc_to_sqlite ops into per-dataset pieces by @rousik in #3098
  • Add a simple test coverage check. by @zaneselvans in #3352
  • Add a simple pytest coverage check on workflow_dispatch or merge queue by @zaneselvans in #3371
  • Provide CodeCov token in pytest workflow. by @zaneselvans in #3374
  • Update docs + add release template by @jdangerx in #3361
  • Stop using live DB in unit tests!! by @jdangerx in #3377
  • Add sec10k metadata to sources by @zschira in #3378
  • Force --no-cov in nightly build by @jdangerx in #3382
  • Use context managers for opening zipfiles by @bendnorman in #3369
  • Update expected row count for EIA tables post 860m quarterly update by @aesharpe in #3380
  • Skip batch job if build was skipped as a whole. by @jdangerx in #3390
  • Update nightly build script to distribute parquet by @zaneselvans in #3399
  • Make an EIA860m Changelog table by @cmgosnell in #3331
  • Parameterize adding a column in the FERC1 transform & ensure _correction records end up in the calculation component table by @cmgosnell in #3409
  • Simplify pytest-cov configuration. by @jdangerx in #3391
  • Prototype dagster-pandera integration by @jdangerx in #3282
  • Fix small plants input table to FERC all plants table by @katie-lamb in #3415
  • Standardize process for merging tagged commits into persistent branches automatically by @zaneselvans in #3347
  • Restore individual FERC 1 plant output tables. by @zaneselvans in #3417
  • Experiment tracking by @zschira in #3289
  • Address loose ends in versioned release mechanics by @zaneselvans in #3421
  • Close out release notes for PUDL v2024.2.6 by @zaneselvans in #3427
  • Fix minor issues that arose in v2024.2.6 release by @zaneselvans in #3432
  • Harvest generator operating dates when they're within a year of one another by @e-belfer in #3419
  • Add RMI beta access to by @jdangerx in #3434
  • Add new citations of Catalyst / PUDL by @zaneselvans in #3435
  • Add BA codes and EIA sector IDs to EIA-860M changelog table by @zaneselvans in #3442
  • Very minor but widespread formatting changes from ruff 0.3.0 by @zaneselvans in #3445
  • Get multiple years of EIA 176/191/757A CSV data by @davidmudrauskas in #3402
  • Delete unused try/except Excel read-in method in pudl.extract.excel by @e-belfer in #3454
  • Update to improve full ETL instructions by @e-belfer in #3446
  • Fix broken links and rendering failure in PR template by @e-belfer in #3458
  • Add metadata for ATB, EIA 930 and AEO data by @e-belfer in #3474
  • Add PUDL citation for Grid Strategies load growth report. by @zaneselvans in #3483
  • Clean EIA 860 and 923 FGD operation and maintenance data by @e-belfer in #3403
  • Fix nightly build FK failure by @e-belfer in #3491
  • Add logline that tells us more about BadZipFile. by @jdangerx in #3493
  • Add total -> subtotal calculation correction & fix hard-coded plant-in-service table name by @cmgosnell in #3450
  • Fix indent error in nightly builds by @e-belfer in #3521
  • Add two new correction records into plant_in_service table by @cmgosnell in #3525
  • Ferc1 rate base tag updates by @cmgosnell in #3517
  • Schema cleanup by @zaneselvans in #3529
  • Refactor etl/ to make adding new modules easier. by @jdangerx in #3539
  • Attempt to limit _out_ferc714__hourly_demand_matrix concurrency by @bendnorman in #3541
  • Manage concurrency of high-memory processes by @zaneselvans in #3543
  • Tag additional assets as high memory usage by @zaneselvans in #3548
  • Rename BA & Utility service territory tables to use conventions by @zaneselvans in #3552
  • Pin ferc-xbrl-extractor<1.4 to facilitate frictionless v5 update by @zaneselvans in #3566
  • Draft of package-level field encoding, applied to EIA by @zaneselvans in #3558
  • Get last non-null value instead of latest XBRL filing. by @jdangerx in #3545
  • Update expected row counts for FERC 1 tables by @zaneselvans in #3574
  • Create beta access SA's for gridpath and zerolab. by @jdangerx in #3577
  • Allow beta service accounts to access Parquet bucket by @jdangerx in #3586
  • Speed up nb-output-clear step in pre-commit by @jdangerx in #3591
  • Enumerate all AEO table 54 schemas. by @jdangerx in #3588
  • Fix quoting in hourly parquet deployment command by @zaneselvans in #3602
  • Remove unused resource keys from asset definitions by @zaneselvans in #3603
  • Stop ignoring test directory passed to pytest. by @jdangerx in #3610
  • Refactor EIA AEO totals checks. by @jdangerx in #3606
  • Clean up a couple warnings and remove obsolete materialize script. by @jdangerx in #3608
  • End use sectors generation by fuel type. by @jdangerx in #3598
  • Always clobber existing outputs in FERC to SQLite conversions by @zaneselvans in #3622
  • Update EIA AEO table description units to be consistent with columns. by @zaneselvans in #3626
  • NREL ATB - Stop dropping duplicate values before unstacking by @cmgosnell in #3630
  • Map new EIA plants and utilities with PUDL IDs for 2024Q1 update by @cmgosnell in #3636
  • Breakdown total utility_type and partial in_rate_base in rate base table by @cmgosnell in #3532
  • Update expected MCOE row counts by @zaneselvans in #3638
  • Add template that includes overview/success criteria/tasks by @jdangerx in #3640
  • Publish FERC1 Rate Base Table by @cmgosnell in #3641
  • Rate base category tweaks by @aesharpe in #3647
  • Organize the large new data section of release notes by @zaneselvans in #3652

Full Changelog: v2024.02.04...v2024.5.0


If you use PUDL, please cite it as indicated below.



