pola-rs/polars: Python Polars 1.6.0
Authors/Creators
- Ritchie Vink
- Stijn de Gooijer (1)
- Alexander Beedie
- Marco Edward Gorelli (2)
- Weijie Guo (3)
- J van Zundert
- Orson Peters (4)
- nameexhaustion
- Gert Hulselmans (5)
- Cory Grinstead
- Gijs Burghoorn (4)
- Marshall
- chielP
- Itamar Turner-Trauring
- Matteo Santamaria (6)
- Lawrence Mitchell
- Daniël Heres (7)
- Josh Magarick
- ibENPC
- Karl Genockey
- Henry Harbeck
- Moritz Wilksch (8)
- deanm0000
- Jorge Leitao (9)
- Mick van Gelderen
- Petros Barbagiannis
- Jonas Haag (10)
- Oliver Borchert (11)
- Ion Koutsouris (12)
- 1. @pola-rs
- 2. Quansight
- 3. @alibaba
- 4. Polars
- 5. @aertslab
- 6. University of California, Berkeley
- 7. @coralogix
- 8. @QuantCo
- 9. Munin Data ApS
- 10. forml.eu
- 11. @Quantco
- 12. ASML
Description
💥 Breaking changes
- Use Altair in DataFrame.plot (#17995)
🚀 Performance improvements
- Parquet do not copy uncompressed pages (#18441)
- Several large parquet optimizations (#18437)
- Batch Plain Parquet UTF-8 verification (#18397)
- Partition metadata for parquet statistic loading (#18343)
- Fix accidental quadratic parquet metadata (#18327)
- Lazy decompress Parquet pages (#18326)
- Don't rechunk aligned chunks in owned_binary_chunk_align (#18314)
- Batch `DELTA_LENGTH_BYTE_ARRAY` decoding (#18299)
- Slice pushdown for SimpleProjection (#18296)
- Use direct path for `time`/`timedelta` literals (#18223)
- Speedup ndjson reader `~40%` (#18197)
- Skip parquet page when unneeded (#18192)
✨ Enhancements
- Use Altair in DataFrame.plot (#17995)
- Allow mapping as syntactic sugar in `str.replace_many` (#18214)
- Respect input time zone if input is pandas Timestamp (#18346)
- Improve Schema and DataType interop with Python types (#18308)
- Add POLARS_BACKTRACE_IN_ERR for debugging (#18333)
- IR serde (#18298)
- Improve decimal_comma error message (#18269)
- Support pre-signed URLs for cloud scan (#18274)
- Support the most recent version of "duckdb_engine" connections via `read_database` (#18277)
- Support empty structs (#18249)
- Allow float in interpolate_by by column (#18015)
- Make show_versions more responsive (#18208)
🐞 Bug fixes
- Enable CSE in eager if struct are expanded (#18426)
- Treat `explode` as `gather` (#18431)
- Parquet nested values that span several pages (#18407)
- Support reading empty parquet files (#18392)
- Recurse on map field during type conversion (#15075)
- Allow search_sorted on boolean series (#18387)
- Mark Expr.(lower|upper)_bound as returning scalar (#18383)
- Fix compressed ndjson row count (#18371)
- Use correct column names when there are no value columns in unpivot (#18340)
- Parquet several smaller issues (#18325)
- Fix group-by slice on all keys (#18324)
- Compute joint null mask before calling rolling corr/cov stats (#18246)
- Several `scan_parquet(parallel='prefiltered')` problems (#18278)
- Json feature flag missing imports (#18305)
- Check groups in group-by filter (#18300)
- Parquet delta encoding for 0-bitwidth miniblocks (#18289)
- Arguments for `upsample` only have to be sorted within groups (#18264)
- Use appropriate bins in `hist` when `bin_count` specified (#16942)
- Raise suitable error on unsupported SQL set op syntax (#18205)
- Fix invalid state due to cached IR (#18262)
- Fix failed AWS credential load from '~/.aws/credentials' due to formatting (#18259)
- Fix panic streaming parquet scan from cloud with slice (#18202)
- Consistently round half-way points down in dt.round (#18245)
- Fix duplicate column output and panic for `include_file_paths` (#18255)
- Fix unit null rank (#18252)
- Use physical for row-encoding (#18251)
- Convert date and datetime in literal construction (#16018)
- Fix gather str as lit (#18207)
📖 Documentation
- Add date_range and datetime_ranges examples without `eager=True` (#18379)
- Fix incorrect comments in `group_by_dynamic` (#18415)
- Alphabetise methods in Python API reference (#18380)
- Document POLARS_BACKTRACE_IN_ERR env var (#18354)
- Add missing aggregation entries (#18334) (#18341)
- Add missing `Series` methods to API reference (#18312)
- Document `DataFrame.__getitem__` and `Series.__getitem__` (#18309)
- Fix typos and add see also links to struct name expressions (#18282)
- Improve decimal_comma error message (#18269)
- Clarify `coalesce` behaviour in `join_asof` (#18273)
- Add note to `Expr.shuffle` differentiating from df method (#18266)
- Improve formatting and consistency of various docstrings (#18237)
- Add missing "Parameters" section to `bin.size` expr docstring (#18222)
- Fix column name output in example of `DataFrame.map_rows` (#18227)
📦 Build system
- Bump Rust toolchain to `nightly-2024-08-26` (#18370)
🛠️ Other improvements
- Address spurious hypothesis test failure (#18434)
- Turn all Binary/Utf8 into BinaryView/Utf8View in Parquet (#18331)
- Fix the required version of rust in README.md (#18357)
- Remove unused Parquet indexes (#18329)
- Deprecate serialize json for LazyFrame (#18283)
- Don't add sink node to cloud query (#18280)
- Split `py-polars` crate (#18204)
- Fix test for new deltalake release (#18211)
- Update the required version of rust in README.md (#18203)
- Fix version bifurcation for `test_read_database_cx_credentials` (#18220)
- Use or_else for raising (#18206)
- Remove unused Parquet source files (#18193)
Thank you to all our contributors for making this release possible! @BartSchuurmans, @ChayimFriedman2, @MarcoGorelli, @StepfenShawn, @agossard, @alexander-beedie, @cgbur, @coastalwhite, @corwinjoy, @deanm0000, @henryharbeck, @ion-elgreco, @jqnatividad, @krasnobaev, @liufeimath, @markxwang, @mcrumiller, @nameexhaustion, @orlp, @ritchie46, @stinodego, @sunadase, @thomascamminady and @wence-
Files
| Name | Size |
|---|---|
| pola-rs/polars-py-1.6.0.zip (md5:a0eeb5dc3b740ae8ca0d1cb0192faf26) | 5.0 MB |
Additional details
Related works
- Is supplement to: Software: https://github.com/pola-rs/polars/tree/py-1.6.0 (URL)