pola-rs/polars: Python Polars 0.20.27

Ritchie Vink; Stijn de Gooijer; Alexander Beedie; Marco Edward Gorelli; Weijie Guo; J van Zundert; Gert Hulselmans; Orson Peters; Cory Grinstead; Marshall; nameexhaustion; chielP; Matteo Santamaria; Daniël Heres; Josh Magarick; ibENPC; Itamar Turner-Trauring; Moritz Wilksch; Jorge Leitao; Mick van Gelderen; Petros Barbagiannis; Jonas Haag; Karl Genockey; Liam Brannigan; Marc van Heerden; Ion Koutsouris; Oliver Borchert; Chris Pryer; Ryan Russell

doi:10.5281/zenodo.11236815

Published May 21, 2024 | Version py-0.20.27

Software Open

1. @pola-rs
2. Quansight
3. @alibaba
4. @aertslab
5. University of California, Berkeley
6. @coralogix
7. @QuantCo
8. Munin Data ApS
9. forml.eu
10. ASML
11. @Quantco

⚠️ Deprecations

Change parameter chunked to allow_chunks in parametric testing strategies (#16264)

🚀 Performance improvements

Use branchless uleb128 decoding for parquet (#16352)
Reduce error bubbling in parquet hybrid_rle (#16348)
Use is_sorted in ewm_mean_by, deprecate check_sorted (#16335)
Optimize is_sorted for numeric data (#16333)
Do not use pyo3-built (#16309)
Faster bitpacking for Parquet writer (#16278)
Improve Series.to_numpy performance for chunked Series that would otherwise be zero-copy (#16301)
Further optimise initial polars import (#16308)
Avoid importing ctypes.util in CPU check script if possible (#16307)
Don't rechunk when converting DataFrame to numpy/ndarray (#16288)
Use zeroed vec in ewm_mean_by for sorted fastpath (#16265)
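
For readers unfamiliar with the ULEB128 encoding mentioned in #16352, the following is a minimal pure-Python sketch of the format itself. It is a plain loop-based illustration, not the branchless Rust implementation from the pull request; the function names decode_uleb128 and encode_uleb128 are hypothetical and chosen only for this example.

```python
def decode_uleb128(data: bytes, offset: int = 0) -> tuple[int, int]:
    """Decode one unsigned LEB128 integer starting at `offset`.

    Returns (value, bytes_consumed). Illustrative sketch only.
    """
    value = 0
    shift = 0
    for consumed, byte in enumerate(data[offset:], start=1):
        value |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if byte & 0x80 == 0:  # a clear high bit marks the final byte
            return value, consumed
        shift += 7
    raise ValueError("truncated ULEB128 sequence")


def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer as ULEB128 (7 payload bits per byte)."""
    if value < 0:
        raise ValueError("ULEB128 encodes unsigned integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow: set the high bit
        else:
            out.append(byte)
            return bytes(out)
```

Each byte contributes seven bits, least significant group first, which is why short values cost a single byte. A branchless decoder removes the per-byte conditional shown here; how that is done in Polars is not described in these notes.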

✨ Enhancements

Expose BooleanFunction in expr IR (#16355)
Allow read_excel to handle bytes/BytesIO directly when using the "calamine" (fastexcel) engine (#16344)
Raise when joining on the same keys twice (#16329)
Don't require data to be sorted by the "by" column in rolling_*_by operations (#16249)
Support List types in Series.to_numpy (#16315)
Add to_jax methods to support Jax Array export from DataFrame and Series (#16294)
Enable generating data with time zones in parametric testing (#16298)
Add struct.field expansion (regex, wildcard, columns) (#16320)
Add new alpha, alphanumeric and digit selectors (#16310)
Faster bitpacking for Parquet writer (#16278)
Add require_all parameter to the by_name column selector (#15028)
Start updating BytecodeParser for Python 3.13 (#16304)
Add struct.with_fields (#16305)
Handle implicit SQL string → temporal conversion in the BETWEEN clause (#16279)
Expose string expression nodes to Python (#16221)
Add new index/range based selector cs.by_index, allow multiple indices for nth (#16217)
Show warning if expressions are very deep (#16233)
Fix some issues in parametric testing with nested dtypes (#16211)

🐞 Bug fixes

Pick a consistent order for the sort options in PyIR (#16350)
Infer CSV schema as supertype of all files (#16349)
Fix issue in parametric testing where excluded_dtypes list would grow indefinitely (#16340)
Address overflow combining u64 hashes in Debug builds (#16323)
Don't exclude explicitly named columns in group-by context expr expansion (#16318)
Improve map_elements typing (#16257)
Harden Series.reshape against invalid parameters (#16281)
Fix list.sum dtype for boolean (#16290)
Don't stackoverflow on all/any horizontal (#16287)
Fix Series.to_numpy for Array types with nulls and nested Arrays (#16230)
rolling_*_by threw an incorrect error when the sorted dataframe contained multiple chunks (#16247)
Don't allow passing missing data to generalized ufuncs (#16198)
Address overly-permissive expand_selectors function, minor fixes (#16250)
Add missing support for parsing instantiated Object dtypes Object() (#16260)
Reading CSV with low_memory gave no data (#16231)
Add missing read_database overload (#16229)
Fix a rounding error in parametric test datetimes generation (#16228)
Fix some issues in parametric testing with nested dtypes (#16211)

📖 Documentation

Add missing word in join docstring (#16299)
Document that month_start/month_end preserve the current time (#16293)
Add example for separator parameter in pivot (#15957)

📦 Build system

Fix allocator features (#16365)
Update Rust nightly toolchain version (#16222)

🛠️ Other improvements

Move DataFrame.to_numpy implementation to Rust side (#16354)
Organize PyO3 NumPy code into interop::numpy module (#16346)
Simplify interpolate code, add test for rolling_*_by with nulls (#16334)
Very minor refactor of DataFrame.to_numpy code (#16325)
InterchangeDataFrame.version should be a ClassVar (not a property) (#16312)
Add polars-expr README (#16316)
Raise import timing test threshold (#16302)
Use cls (not self) in classmethods (#16303)
Use Scalar instead of Series in some aggregations (#16277)
Do not hardcode bash path in Makefile (#16263)
Add IR::Reduce (not yet implemented) (#16216)
Move all describe, describe_tree and dot-viz code to IR instead of DslPlan (#16237)

Thank you to all our contributors for making this release possible! @MarcoGorelli, @NickCondron, @ShivMunagala, @alexander-beedie, @brandon-b-miller, @coastalwhite, @datenzauberai, @itamarst, @jsarbach, @max-muoto, @nameexhaustion, @orlp, @r-brink, @ritchie46, @stinodego, @thalassemia, @twoertwein and @wence-

Files (4.6 MB)

Name	Size
pola-rs/polars-py-0.20.27.zip (md5:c54db2e770adc202a37f0904f639584b)	4.6 MB

Additional details

Is supplement to: Software: https://github.com/pola-rs/polars/tree/py-0.20.27 (URL)

	All versions	This version
Views	14,211	30
Downloads	1,925	11
Data volume	8.0 GB	50.3 MB
