Published May 21, 2024 | Version py-0.20.27
Software
Open
pola-rs/polars: Python Polars 0.20.27
Authors/Creators
- Ritchie Vink
- Stijn de Gooijer (1)
- Alexander Beedie
- Marco Edward Gorelli (2)
- Weijie Guo (3)
- J van Zundert
- Gert Hulselmans (4)
- Orson Peters
- Cory Grinstead
- Marshall
- nameexhaustion
- chielP
- Matteo Santamaria (5)
- Daniël Heres (6)
- Josh Magarick
- ibENPC
- Itamar Turner-Trauring
- Moritz Wilksch (7)
- Jorge Leitao (8)
- Mick van Gelderen
- Petros Barbagiannis
- Jonas Haag (9)
- Karl Genockey
- Liam Brannigan
- Marc van Heerden
- Ion Koutsouris (10)
- Oliver Borchert (11)
- Chris Pryer
- Ryan Russell
- 1. @pola-rs
- 2. Quansight
- 3. @alibaba
- 4. @aertslab
- 5. University of California, Berkeley
- 6. @coralogix
- 7. @QuantCo
- 8. Munin Data ApS
- 9. forml.eu
- 10. ASML
- 11. @Quantco
Description
⚠️ Deprecations
- Change parameter `chunked` to `allow_chunks` in parametric testing strategies (#16264)
🚀 Performance improvements
- Use branchless uleb128 decoding for parquet (#16352)
- Reduce error bubbling in parquet hybrid_rle (#16348)
- use is_sorted in ewm_mean_by, deprecate check_sorted (#16335)
- Optimize `is_sorted` for numeric data (#16333)
- do not use pyo3-built (#16309)
- Faster bitpacking for Parquet writer (#16278)
- Improve `Series.to_numpy` performance for chunked Series that would otherwise be zero-copy (#16301)
- Further optimise initial `polars` import (#16308)
- Avoid importing `ctypes.util` in CPU check script if possible (#16307)
- Don't rechunk when converting DataFrame to numpy/ndarray (#16288)
- use zeroed vec in ewm_mean_by for sorted fastpath (#16265)
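For readers unfamiliar with the encoding named in #16352: ULEB128 is the variable-length unsigned integer format used for the headers of Parquet's RLE/bit-packed hybrid runs. The following is a minimal sketch of a plain, loop-based decoder in Python. It is an illustration only; the function name `decode_uleb128` is hypothetical, and it is not the branchless Rust implementation from the PR.

```python
def decode_uleb128(buf: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode one unsigned LEB128 integer from `buf`, starting at `pos`.

    Returns (value, number_of_bytes_consumed).
    Illustrative only: hypothetical helper, not Polars code.
    """
    value = 0
    shift = 0
    for consumed, byte in enumerate(buf[pos:], start=1):
        # The low 7 bits of each byte carry payload, least significant group first.
        value |= (byte & 0x7F) << shift
        # A clear high bit marks the final byte of the integer.
        if (byte & 0x80) == 0:
            return value, consumed
        shift += 7
    raise ValueError("truncated ULEB128 value")
```

The loop branches on the continuation bit of every byte; a branchless variant avoids that per-byte branch, which is the kind of optimization the changelog entry refers to.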
✨ Enhancements
- expose BooleanFunction in expr IR (#16355)
- Allow `read_excel` to handle bytes/BytesIO directly when using the "calamine" (fastexcel) engine (#16344)
- Raise when joining on the same keys twice (#16329)
- Don't require data to be sorted by `by` column in `rolling_*_by` operations (#16249)
- Support List types in `Series.to_numpy` (#16315)
- Add `to_jax` methods to support Jax Array export from `DataFrame` and `Series` (#16294)
- Enable generating data with time zones in parametric testing (#16298)
- Add struct.field expansion (regex, wildcard, columns) (#16320)
- Add new `alpha`, `alphanumeric` and `digit` selectors (#16310)
- Faster bitpacking for Parquet writer (#16278)
- Add `require_all` parameter to the `by_name` column selector (#15028)
- Start updating `BytecodeParser` for Python 3.13 (#16304)
- Add `struct.with_fields` (#16305)
- Handle implicit SQL string → temporal conversion in the `BETWEEN` clause (#16279)
- Expose string expression nodes to python (#16221)
- Add new index/range based selector `cs.by_index`, allow multiple indices for `nth` (#16217)
- Show warning if expressions are very deep (#16233)
- Fix some issues in parametric testing with nested dtypes (#16211)
🐞 Bug fixes
- pick a consistent order for the sort options in PyIR (#16350)
- Infer CSV schema as supertype of all files (#16349)
- Fix issue in parametric testing where `excluded_dtypes` list would grow indefinitely (#16340)
- Address overflow combining u64 hashes in Debug builds (#16323)
- Don't exclude explicitly named columns in group-by context's expr expansion (#16318)
- Improve `map_elements` typing (#16257)
- Harden `Series.reshape` against invalid parameters (#16281)
- Fix list.sum dtype for boolean (#16290)
- Don't stackoverflow on all/any horizontal (#16287)
- Fix `Series.to_numpy` for Array types with nulls and nested Arrays (#16230)
- `rolling_*_by` was throwing incorrect error when dataframe was sorted by contained multiple chunks (#16247)
- Don't allow passing missing data to generalized ufuncs (#16198)
- Address overly-permissive `expand_selectors` function, minor fixes (#16250)
- Add missing support for parsing instantiated Object dtypes `Object()` (#16260)
- Reading CSV with low_memory gave no data (#16231)
- Add missing `read_database` overload (#16229)
- Fix a rounding error in parametric test datetimes generation (#16228)
- Fix some issues in parametric testing with nested dtypes (#16211)
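To illustrate what "schema as supertype of all files" (#16349) can mean in practice, here is a toy sketch that merges per-file schemas by widening each column to the broadest dtype seen. Everything in it is an assumption made for illustration: the three-step widening order, the dtype names as plain strings, and the helpers `supertype` and `merge_schemas` are hypothetical and far simpler than Polars's actual supertype rules.

```python
# Hypothetical widening order, narrowest first. Polars's real rules cover many more dtypes.
WIDENING_ORDER = ["Int64", "Float64", "String"]


def supertype(a: str, b: str) -> str:
    """Return whichever of the two dtypes sits later in the widening order."""
    return max(a, b, key=WIDENING_ORDER.index)


def merge_schemas(schemas: list[dict[str, str]]) -> dict[str, str]:
    """Combine per-file schemas so each column gets the widest dtype seen in any file."""
    merged: dict[str, str] = {}
    for schema in schemas:
        for name, dtype in schema.items():
            merged[name] = supertype(merged[name], dtype) if name in merged else dtype
    return merged


# Two files whose inferred schemas disagree on both columns.
file_schemas = [
    {"a": "Int64", "b": "Int64"},
    {"a": "Float64", "b": "String"},
]
merged_schema = merge_schemas(file_schemas)
```

In this sketch a column that looks integral in one file but fractional in another is widened to `Float64` for the whole scan, instead of taking its dtype from a single file.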
📖 Documentation
- Add missing word in `join` docstring (#16299)
- document that month_start/month_end preserve the current time (#16293)
- Add example for separator parameter in pivot (#15957)
📦 Build system
- Fix allocator features (#16365)
- Update Rust nightly toolchain version (#16222)
🛠️ Other improvements
- Move `DataFrame.to_numpy` implementation to Rust side (#16354)
- Organize PyO3 NumPy code into `interop::numpy` module (#16346)
- simplify interpolate code, add test for rolling_*_by with nulls (#16334)
- Very minor refactor of `DataFrame.to_numpy` code (#16325)
- `InterchangeDataFrame.version` should be a `ClassVar` (not a `property`) (#16312)
- Add `polars-expr` README (#16316)
- Raise import timing test threshold (#16302)
- Use `cls` (not `self`) in classmethods (#16303)
- Use Scalar instead of Series some aggregations (#16277)
- Do not hardcode bash path in Makefile (#16263)
- Add IR::Reduce (not yet implemented) (#16216)
- move all describe, describe_tree and dot-viz code to IR instead of DslPlan (#16237)
Thank you to all our contributors for making this release possible! @MarcoGorelli, @NickCondron, @ShivMunagala, @alexander-beedie, @brandon-b-miller, @coastalwhite, @datenzauberai, @itamarst, @jsarbach, @max-muoto, @nameexhaustion, @orlp, @r-brink, @ritchie46, @stinodego, @thalassemia, @twoertwein and @wence-
Files
| Name | Size |
|---|---|
| pola-rs/polars-py-0.20.27.zip (md5:c54db2e770adc202a37f0904f639584b) | 4.6 MB |
Additional details
Related works
- Is supplement to
- Software: https://github.com/pola-rs/polars/tree/py-0.20.27 (URL)