pola-rs/polars: Python Polars 1.6.0

Ritchie Vink; Stijn de Gooijer; Alexander Beedie; Marco Edward Gorelli; Weijie Guo; J van Zundert; Orson Peters; nameexhaustion; Gert Hulselmans; Cory Grinstead; Gijs Burghoorn; Marshall; chielP; Itamar Turner-Trauring; Matteo Santamaria; Lawrence Mitchell; Daniël Heres; Josh Magarick; ibENPC; Karl Genockey; Henry Harbeck; Moritz Wilksch; deanm0000; Jorge Leitao; Mick van Gelderen; Petros Barbagiannis; Jonas Haag; Oliver Borchert; Ion Koutsouris

doi:10.5281/zenodo.13387733

Published August 28, 2024 | Version py-1.6.0

Software Open

1. @pola-rs
2. Quansight
3. @alibaba
4. Polars
5. @aertslab
6. University of California, Berkeley
7. @coralogix
8. @QuantCo
9. Munin Data ApS
10. forml.eu
11. @Quantco
12. ASML

💥 Breaking changes

Use Altair in DataFrame.plot (#17995)

🚀 Performance improvements

Do not copy uncompressed Parquet pages (#18441)
Several large parquet optimizations (#18437)
Batch Plain Parquet UTF-8 verification (#18397)
Partition metadata for parquet statistic loading (#18343)
Fix accidental quadratic parquet metadata (#18327)
Lazy decompress Parquet pages (#18326)
Don't rechunk aligned chunks in owned_binary_chunk_align (#18314)
Batch DELTA_LENGTH_BYTE_ARRAY decoding (#18299)
Slice pushdown for SimpleProjection (#18296)
Use direct path for time/timedelta literals (#18223)
Speedup ndjson reader ~40% (#18197)
Skip parquet page when unneeded (#18192)

✨ Enhancements

Use Altair in DataFrame.plot (#17995)
Allow mapping as syntactic sugar in str.replace_many (#18214)
Respect input time zone if input is pandas Timestamp (#18346)
Improve Schema and DataType interop with Python types (#18308)
Add POLARS_BACKTRACE_IN_ERR for debugging (#18333)
IR serde (#18298)
Improve decimal_comma error message (#18269)
Support pre-signed URLs for cloud scan (#18274)
Support the most recent version of "duckdb_engine" connections via read_database (#18277)
Support empty structs (#18249)
Allow float in interpolate_by by column (#18015)
Make show_versions more responsive (#18208)

🐞 Bug fixes

Enable CSE in eager if structs are expanded (#18426)
Treat explode as gather (#18431)
Parquet nested values that span several pages (#18407)
Support reading empty parquet files (#18392)
Recurse on map field during type conversion (#15075)
Allow search_sorted on boolean series (#18387)
Mark Expr.(lower|upper)_bound as returning scalar (#18383)
Fix compressed ndjson row count (#18371)
Use correct column names when there are no value columns in unpivot (#18340)
Parquet several smaller issues (#18325)
Fix group-by slice on all keys (#18324)
Compute joint null mask before calling rolling corr/cov stats (#18246)
Several scan_parquet(parallel='prefiltered') problems (#18278)
Json feature flag missing imports (#18305)
Check groups in group-by filter (#18300)
Parquet delta encoding for 0-bitwidth miniblocks (#18289)
Arguments for upsample only have to be sorted within groups (#18264)
Use appropriate bins in hist when bin_count specified (#16942)
Raise suitable error on unsupported SQL set op syntax (#18205)
Fix invalid state due to cached IR (#18262)
Fix failed AWS credential load from '~/.aws/credentials' due to formatting (#18259)
Fix panic streaming parquet scan from cloud with slice (#18202)
Consistently round half-way points down in dt.round (#18245)
Fix duplicate column output and panic for include_file_paths (#18255)
Fix unit null rank (#18252)
Use physical for row-encoding (#18251)
Convert date and datetime in literal construction (#16018)
Fix gather str as lit (#18207)

📖 Documentation

Add date_range and datetime_ranges examples without eager=True (#18379)
Fix incorrect comments in group_by_dynamic (#18415)
Alphabetise methods in Python API reference (#18380)
Document POLARS_BACKTRACE_IN_ERR env var (#18354)
Add missing aggregation entries (#18334) (#18341)
Add missing Series methods to API reference (#18312)
Document DataFrame.__getitem__ and Series.__getitem__ (#18309)
Fix typos and add see also links to struct name expressions (#18282)
Improve decimal_comma error message (#18269)
Clarify coalesce behaviour in join_asof (#18273)
Add note to Expr.shuffle differentiating from df method (#18266)
Improve formatting and consistency of various docstrings (#18237)
Add missing "Parameters" section to bin.size expr docstring (#18222)
Fix column name output in example of DataFrame.map_rows (#18227)

📦 Build system

Bump Rust toolchain to nightly-2024-08-26 (#18370)

🛠️ Other improvements

Address spurious hypothesis test failure (#18434)
Turn all Binary/Utf8 into BinaryView/Utf8View in Parquet (#18331)
Fix the required version of rust in README.md (#18357)
Remove unused Parquet indexes (#18329)
Deprecate serialize json for LazyFrame (#18283)
Don't add sink node to cloud query (#18280)
Split py-polars crate (#18204)
Fix test for new deltalake release (#18211)
Update the required version of rust in README.md (#18203)
Fix version bifurcation for test_read_database_cx_credentials (#18220)
Use or_else for raising (#18206)
Remove unused Parquet source files (#18193)

Thank you to all our contributors for making this release possible! @BartSchuurmans, @ChayimFriedman2, @MarcoGorelli, @StepfenShawn, @agossard, @alexander-beedie, @cgbur, @coastalwhite, @corwinjoy, @deanm0000, @henryharbeck, @ion-elgreco, @jqnatividad, @krasnobaev, @liufeimath, @markxwang, @mcrumiller, @nameexhaustion, @orlp, @ritchie46, @stinodego, @sunadase, @thomascamminady and @wence-

Files

Files (5.0 MB)

Name	Size
pola-rs/polars-py-1.6.0.zip (md5:a0eeb5dc3b740ae8ca0d1cb0192faf26)	5.0 MB

Additional details

Is supplement to: Software: https://github.com/pola-rs/polars/tree/py-1.6.0 (URL)
