Published November 15, 2020 | Version v0.22.0
Software Open

JuliaData/DataFrames.jl: v0.22.0

  • 1. Facebook
  • 2. SGH Warsaw School of Economics
  • 3. French Institute for Demographic Studies (Ined)
  • 4. Google
  • 5. Technical University of Munich
  • 6. University of Wisconsin
  • 7. Electric Power Research Institute
  • 8. SecondSpectrum
  • 9. Beacon Biosignals
  • 10. University of California, Berkeley
  • 11. Rutgers University
  • 12. @JuliaComputing
  • 13. Julia Computing
  • 14. @JuliaComputing / @NYU-MSDSE-SWG
  • 15. The Chinese University of Hong Kong
  • 16. Sciences Po
  • 17. @invenia Labs

Description

DataFrames v0.22.0

Diff since v0.21.8

DataFrames v0.22 Release Notes

Breaking changes
  • the rules for transformations passed to select/select!, transform/transform!, and combine have been made more flexible; in particular now it is allowed to return multiple columns from a transformation function (#2461 and #2481)
  • CategoricalArrays.jl is no longer reexported: call using CategoricalArrays to use it (#2404). In the same vein, the categorical and categorical! functions have been deprecated in favor of transform(df, cols .=> categorical .=> cols) and similar syntaxes (#2394). stack now creates a PooledVector{String} variable column rather than a CategoricalVector{String} column by default; pass variable_eltype=CategoricalValue{String} to get the previous behavior (#2391)
  • isless for DataFrameRows now checks column names (#2292)
  • DataFrameColumns is now not a subtype of AbstractVector (#2291)
  • nunique is not reported now by describe by default (#2339)
  • stop reordering columns of the parent in transform and transform!; always generate columns that were specified to be computed even for GroupedDataFrame with zero rows (#2324)
  • improve the rule for automatically generated column names in combine/select(!)/transform(!) with composed functions (#2274)
  • :nmissing in describe now produces 0 if the column does not allow missing values; earlier nothing was produced in this case (#2360)
  • fast aggregation functions for GroupedDataFrame now correctly choose the fast path only when it is safe; this resolves inconsistencies with what the same functions produce when not using the fast path (#2357)
  • joins now return PooledVector not CategoricalVector in indicator column (#2505)
  • GroupKeys now supports in for GroupKey, Tuple, NamedTuple and dictionaries (#2392)
  • in describe the specification of custom aggregation is now function => name; old name => function order is now deprecated (#2401)
  • in joins, passing NaN or real or imaginary -0.0 in an on column now throws an error; passing missing throws an error unless the matchmissing=:equal keyword argument is passed (#2504)
  • unstack now produces row and column keys in the order of their first appearance and has two new keyword arguments allowmissing and allowduplicates (#2494)
  • PrettyTables.jl is now the default back-end to print DataFrames to text/plain; the print option splitcols was removed and the output format was changed (#2429)
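A few of the breaking changes above can be tried on toy data. The following is a minimal sketch, assuming DataFrames.jl 0.22 or later; the data frames and the names `lo`, `hi`, and `largest` are made up for illustration and do not come from the release notes.

```julia
using DataFrames

# Hypothetical toy data.
df = DataFrame(id = [1, 1, 2, 2], x = [1.0, 2.0, 3.0, 5.0])
gdf = groupby(df, :id)

# One transformation returning two columns: the fields of the returned
# NamedTuple become separate columns because the target is AsTable.
stats = combine(gdf, :x => (v -> (lo = minimum(v), hi = maximum(v))) => AsTable)

# Custom statistics in describe use the function => name order.
desc = describe(df, :mean, maximum => :largest)

# Joining on a key column that contains missing needs an explicit opt-in.
left = DataFrame(k = [1, missing], a = ["p", "q"])
right = DataFrame(k = [1, missing], b = ["r", "s"])
joined = innerjoin(left, right, on = :k, matchmissing = :equal)

# Without matchmissing = :equal the same join is expected to throw.
threw = try
    innerjoin(left, right, on = :k)
    false
catch
    true
end
```

Passing `AsTable` as the target is one way to request multiple output columns; a vector of column names can be given instead when the output names should be fixed in advance.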
New functionalities
  • add filter to GroupedDataFrame (#2279)
  • add empty and empty! function for DataFrame that remove all rows from it, but keep columns (#2262)
  • make indicator keyword argument in joins allow passing a string (#2284, #2296)
  • add new functions to GroupKey API to make it more consistent with DataFrameRow (#2308)
  • allow column renaming in joins (#2313 and #2398)
  • add rownumber to DataFrameRow (#2356)
  • allow passing a column name to specify the position where a new column should be inserted in insertcols! (#2365)
  • allow GroupedDataFrames to be indexed using a dictionary, which can use Symbol or string keys and is not dependent on the order of keys (#2281)
  • add isapprox method to check for approximate equality between two dataframes (#2373)
  • add columnindex for DataFrameRow (#2380)
  • names now accepts Type as a column selector (#2400)
  • select, select!, transform, transform! and combine now allow renamecols keyword argument that makes it possible to avoid adding transformation function name as a suffix in automatically generated column names (#2397)
  • filter, sort, dropmissing, and unique now support a view keyword argument which, if set to true, makes them return a SubDataFrame view into the passed data frame.
  • add only method for AbstractDataFrame (#2449)
  • passing empty sets of columns in filter/filter! and in select/transform/combine with ByRow is now accepted (#2476)
  • add permutedims method for AbstractDataFrame (#2447)
  • add support for Cols from DataAPI.jl (#2495)
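Several of the additions listed above are easiest to see side by side. This is a small sketch, assuming DataFrames.jl 0.22 or later; the data and variable names are hypothetical.

```julia
using DataFrames

# Hypothetical toy data.
df = DataFrame(id = [1, 1, 2], x = [1.0, 2.0, 3.0], s = ["a", "b", "c"])
gdf = groupby(df, :id)

# filter on a GroupedDataFrame keeps whole groups; here, groups with
# more than one row. The result is still a GroupedDataFrame.
big_groups = filter(sdf -> nrow(sdf) > 1, gdf)

# A group can be looked up with a dictionary of key values.
group_two = gdf[Dict(:id => 2)]

# view = true returns a SubDataFrame instead of copying the rows.
tail_view = filter(:x => >(1.5), df, view = true)

# empty keeps the columns but drops every row.
no_rows = empty(df)

# names with a Type selects columns by element type.
numeric_names = names(df, Real)

# renamecols = false keeps the source name instead of "x_sum".
total = combine(df, :x => sum, renamecols = false)

# rownumber reports the position of a DataFrameRow in its data frame.
second_row = df[2, :]
position = rownumber(second_row)
```

Because `tail_view` is a view, it shares memory with `df`; copy it with `DataFrame(tail_view)` if an independent data frame is needed.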
Deprecated
  • DataFrame! is now deprecated (#2338)
  • several non-standard DataFrame constructors are now deprecated (#2464)
  • all old deprecations now throw an error (#2350)
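For code that relied on DataFrame! to avoid copying columns, one common replacement is the regular constructor with copycols=false. The sketch below assumes DataFrames.jl 0.22 or later and uses a made-up NamedTuple of vectors as the source table.

```julia
using DataFrames

# Hypothetical source table: a NamedTuple of vectors.
cols = (a = [1, 2, 3], b = ["x", "y", "z"])

# copycols = false reuses the source vectors instead of copying them.
df_shared = DataFrame(cols, copycols = false)

# copycols = true allocates fresh columns.
df_copy = DataFrame(cols, copycols = true)

# The shared data frame aliases the original vector; the copy does not.
shares_memory = df_shared.a === cols.a
is_independent = df_copy.a !== cols.a
```

Aliasing means that mutating `cols.a` in place also changes `df_shared.a`, so the no-copy form is best kept for cases where the source is not reused.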
Dependency changes
  • Tables.jl version 1.2 is now required.
  • DataAPI.jl version 1.4 is now required. It implies that All(args...) is deprecated and Cols(args...) is recommended instead. All() is still supported.
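The move from All(args...) to Cols(args...) can be illustrated as follows. This is a sketch on hypothetical data, assuming DataFrames.jl 0.22 or later with DataAPI.jl 1.4 or later.

```julia
using DataFrames

# Hypothetical toy data.
df = DataFrame(a = 1:2, b = 3:4, c = 5:6)

# Cols takes the union of its selectors, in the order they are given.
picked = select(df, Cols(:c, :a))

# Combining a name with : moves that column to the front.
front = select(df, Cols(:c, :))

# All() without arguments still selects every column.
everything = select(df, All())
```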
Other relevant changes
  • Documentation is now available also in Dark mode (#2315)
  • add rich display support for Markdown cell entries in HTML and LaTeX (#2346)
  • limit the maximal display width the output can use in text/plain before being truncated (in the textwidth sense, excluding …) to 32 per column by default and fix a corner case where no columns were printed because they were too wide (#2403)
  • Common methods are now precompiled to improve responsiveness the first time a method is called in a Julia session. Precompilation takes up to 30 seconds after installing the package (#2456).

Closed issues:

  • Allow to hide row numbers (#592)
  • Stop printing row numbers in show(io, df)? (#864)
  • Show a (kind of) transposed DataFrame (#2065)
  • Improve text/plain show for AbstractDataFrame (#2146)
  • Showing of very wide data frames (#2302)
  • Add PrettyTables.jl as an alternative backend for display in DataFrames.jl (#2337)
  • add transpose(df, src_namescol, dst_namescol) (#2420)
  • Deprecate DataFrame(::AbstractMatrix) (#2433)
  • Always use ? for Union{T, Missing} (#2480)
  • Stop supporting broadcasting + against whole DataFrames (#2483)
  • clean-up unstack (#2485)
  • Join on index with compatible Unitful types (#2486)
  • ERROR: UndefVarError: ByRow not defined (#2493)
  • Explicitly handling missingness in join columns (#2499)
  • sort with by accepts tuples still (#2500)
  • innerjoin not working if one df is a SubDataFrame or item of GroupedDataFrame (#2502)
  • remaining dependencies on CategoricalArrays (#2506)
  • Immutable DataFrames (#2507)
  • general principles of data manipulation for dicussion (#2509)
  • create maprow to be complementary with mapcol (#2510)
  • insertcols!(df, values => :name ) (#2512)
  • [Feature request] Support for converting single-column dataframes to Vectors (#2526)
  • Sync tests with Tables 1.2 (#2529)
  • select does not have method to handle Pair? (#2531)
  • Warning: getindex(df::DataFrame, col_ind::ColumnIndex) is deprecated (#2532)
  • ERROR: The following package names could not be resolved: (#2534)

Merged pull requests:

  • remove dependency on CategoricalArrays.jl in legacy show (#2427) (@bkamins)
  • [BREAKING] Add PrettyTables.jl backend for printing DataFrames (#2429) (@ronisbr)
  • Implement permutedims (#2447) (@kescobo)
  • Enable precompilation (#2456) (@nalimilan)
  • [BREAKING] deprecate DataFrame constructors (#2464) (@bkamins)
  • [BREAKING] Multicolumn transformations for GoupedDataFrame (#2481) (@bkamins)
  • [BREAKING] Refactor unstack (#2494) (@bkamins)
  • add Cols support (#2495) (@bkamins)
  • avoid allocation when negating BitArray (#2497) (@OkonSamuel)
  • make sure by isa Function or a vector of functions (#2501) (@bkamins)
  • Remove type parameters in DataFrameJoiner (#2503) (@bkamins)
  • [BREAKING] add matchmissing kwarg to joins (#2504) (@bkamins)
  • [BREAKING] remove CategoricalArrays dependency from joins (#2505) (@bkamins)
  • fix deprecated tests in reshape (#2511) (@bkamins)
  • require DataAPI.jl version 1.4 (#2514) (@bkamins)
  • move All(args...) tests to deprecated.jl (#2515) (@bkamins)
  • make hashrows_col! not depend on CategoricalArrays.jl (#2518) (@bkamins)
  • avoid CategoricalArrays dependency in aggregates (#2519) (@bkamins)
  • Switch from Coveralls to Codecov (#2520) (@nalimilan)
  • Allow CategoricalArrays 0.9 (#2521) (@nalimilan)
  • update manual and docstrings to PrettyTables.jl (#2522) (@bkamins)
  • Update Categorical test (#2523) (@bkamins)
  • fix coverage badge (#2524) (@pdeffebach)
  • Update TagBot.yml (#2527) (@quinnj)
  • Update tests to Tables.jl v1.2 (#2530) (@bkamins)
  • Add StatsKit to the ecosystem section (#2535) (@nalimilan)
  • code layout improvements (#2536) (@bkamins)
  • Improve floating point alignment (#2537) (@ronisbr)
  • update deprecated tests (#2538) (@bkamins)

Files

JuliaData/DataFrames.jl-v0.22.0.zip

Files (393.3 kB)

md5:8c54d56fc2a4118f9587d4a9ff8a33c6 (393.3 kB)
