mwaskom/seaborn: v0.13.0 (September 2023)
Creators
- Michael Waskom
- Maoz Gelbart
- Olga Botvinnik (@BridgeBioAnalytics)
- Joel Ostblom (University of British Columbia)
- Paul Hobson (@HerreraEnvironmental)
- Saulius Lukauskas
- David C Gemperline
- Tom Augspurger (@microsoft)
- Yaroslav Halchenko (Dartmouth College, @Debian, @DataLad, @PyMVPA, @fail2ban)
- Jordi Warmenhoven
- John B. Cole
- Ewout ter Hoeven
- Julian de Ruiter (GoDataDriven)
- Jake Vanderplas (Google)
- Stephan Hoyer (@google)
- Cameron Pye (Unnatural Products Inc.)
- Alistair Miles (Wellcome Sanger Institute)
- Corban Swain
- Kyle Meyer
- Marcel Martin
- Pete Bachant (@WindESCo)
- Stefanie Molin (@Bloomberg)
- Eric Quintero
- Gero Kunter (Universität Siegen)
- Santi Villalba
- Brian
- Clark Fitzgerald (Math and Stats Department)
- Constantine Evans
- Mike Lee Williams
Description
See the online docs for an annotated version of these notes with working links.
This is a major release with a number of important new features and changes. The highlight is a major overhaul to seaborn's categorical plotting functions, providing them with many new capabilities and better aligning their API with the rest of the library. There is also provisional support for alternate dataframe libraries like polars, a new theme and display configuration system for `objects.Plot`, and many smaller bugfixes and enhancements.
Updating is recommended, but users are encouraged to carefully check the outputs of existing code that uses the categorical functions, and they should be aware of some deprecations and intentional changes to the default appearance of the resulting plots (see the notes below).
Major enhancements to categorical plots
Seaborn's categorical functions have been completely rewritten for this release. This provided the opportunity to address some longstanding quirks as well as to add a number of smaller but much-desired features and enhancements.
The categorical functions have historically treated all data as categorical, even when it has a numeric or datetime type. This can now be controlled with the new `native_scale` parameter. The default remains `False` to preserve existing behavior. But with `native_scale=True`, values will be treated as they would by other seaborn or matplotlib functions. Element widths will be derived from the minimum distance between two unique values on the categorical axis.
Additionally, while seaborn previously determined the mapping from categorical values to ordinal positions internally, this is now delegated to matplotlib. The change should mostly be transparent to the user, but categorical plots (even with `native_scale=False`) will better align with artists added by other seaborn or matplotlib functions in most cases, and matplotlib's interactive machinery will work better.
Changes to color defaults and specification
The categorical functions now act more like the rest of seaborn in that they will produce a plot with a single main color unless the `hue` variable is assigned. Previously, there would be an implicit redundant color mapping (e.g., each box in a boxplot would get a separate color from the default palette). To retain the previous behavior, explicitly assign a redundant `hue` variable (e.g., `boxplot(data, x="x", y="y", hue="x")`).
Two related idiosyncratic color specifications are deprecated, but they will continue to work (with a warning) for one release cycle:
- Passing a `palette` without explicitly assigning `hue` is no longer supported (add an explicitly redundant `hue` assignment instead).
- Passing a `color` while assigning `hue` to produce a gradient is no longer supported (use `palette="dark:{color}"` or `palette="light:{color}"` instead).
Finally, like other seaborn functions, the default palette now depends on the variable type, and a sequential palette will be used with numeric data. To retain the previous behavior, pass the name of a qualitative palette (e.g., `palette="deep"` for seaborn's default). Accordingly, the functions have gained a parameter to control numeric color mappings (`hue_norm`).
Other features, enhancements, and changes
The following updates apply to multiple categorical functions.
- All functions now accept a `legend` parameter, which can be a boolean (to suppress the legend) or one of `{"auto", "brief", "full"}` to control the amount of information shown in the legend for a numerical color mapping.
- All functions now accept a callable `formatter` parameter to control the string representation of the data.
- All functions that draw a solid patch now accept a boolean `fill` parameter, which when set to `False` will draw line-art elements.
- All functions that support dodging now have an additional `gap` parameter that can be set to a non-zero value to leave space between dodged elements.
- The `boxplot`, `boxenplot`, and `violinplot` functions now support a single `linecolor` parameter.
- The default value for `dodge` has changed from `True` to `"auto"`. With `"auto"`, elements will dodge only when at least one set of elements would otherwise overlap.
- When the value axis of the plot has a non-linear scale, the statistical operations (e.g. an aggregation in `pointplot` or the kernel density fit in `violinplot`) are now applied in that scale space.
- All functions now accept a `log_scale` parameter. With a single argument, this will set the scale on the "value" axis (opposite the categorical axis). A tuple will set each axis directly (although setting a log scale categorical axis also requires `native_scale=True`).
- The `orient` parameter now accepts `"x"/"y"` to specify the categorical axis, matching the objects interface.
- The categorical functions are generally more deferential to the user's additional matplotlib keyword arguments.
- Using `"gray"` to select an automatic gray value that complements the main palette is now deprecated in favor of `"auto"`.
The following updates are function-specific.
- In `pointplot`, a single `matplotlib.lines.Line2D` artist is now used rather than adding a separate `matplotlib.collections.PathCollection` artist for the points. As a result, it is now possible to pass additional keyword arguments for complete customization of the appearance of both the lines and markers; additionally, the legend representation is improved. Accordingly, parameters that previously allowed only partial customization (`scale`, `join`, and `errwidth`) are now deprecated. The old parameters will now trigger detailed warning messages with instructions for adapting existing code.
- The bandwidth specification in `violinplot` better aligns with `kdeplot`, as the `bw` parameter is now deprecated in favor of `bw_method` and `bw_adjust`.
- In `boxenplot`, the boxen are now drawn with separate patch artists in each tail. This may have consequences for code that works with the underlying artists, but it produces a better result for low-alpha / unfilled plots and enables proper area/density scaling.
- In `barplot`, the `errcolor` and `errwidth` parameters are now deprecated in favor of a more general `err_kws` dictionary. The existing parameters will continue to work for two releases.
- In `violinplot`, the `scale` and `scale_hue` parameters have been renamed to `density_norm` and `common_norm` for clarity and to reflect the fact that common normalization is now applied over both hue and faceting variables in `catplot`.
- In `boxenplot`, the `scale` parameter has been renamed to `width_method` as part of a broader effort to de-confound the meaning of "scale" in seaborn parameters.
- When passing a vector to the `data` parameter of `barplot` or `pointplot`, a bar or point will be drawn for each entry in the vector rather than plotting a single aggregated value. To retain the previous behavior, assign the vector to the `y` variable.
- In `boxplot`, the default flier marker now follows the matplotlib rcParams so that it can be globally customized.
- When using `split=True` and `inner="box"` in `violinplot`, a separate mini-box is now drawn for each split violin.
- In `boxenplot`, all plots now use a consistent luminance ramp for the different box levels. This leads to a change in the appearance of existing plots, but reduces the chances of a misleading result.
- The `"area"` scaling in `boxenplot` now approximates the density of the underlying observations, including for asymmetric distributions. This produces a substantial change in the appearance of plots with `width_method="area"`, although the existing behavior was poorly defined.
- In `countplot`, the new `stat` parameter can be used to apply a normalization (e.g., to show a `"percent"` or `"proportion"`).
- The `split` parameter in `violinplot` is now more general and can be set to `True` regardless of the number of `hue` variable levels (or even without `hue`). This is probably most useful for showing half violins.
- In `violinplot`, the new `inner_kws` parameter allows additional control over the interior artists.
- It is no longer required to use a `DataFrame` in `catplot`, as data vectors can now be passed directly.
- In `boxplot`, the artists that comprise each box plot are now packaged in a `BoxPlotContainer` for easier post-plotting access.
- Nearly all functions / objects now use the dataframe exchange protocol to accept `DataFrame` objects from libraries other than `pandas` (e.g. `polars`). Note that seaborn will still convert the data object to pandas internally, but this feature will simplify code for users of other dataframe libraries (3369).
- Added control over the default theme to `objects.Plot` (3223).
- Added control over the default notebook display to `objects.Plot` (3225).
- Added the concept of a "layer legend" in `objects.Plot` via the new `label` parameter in `objects.Plot.add` (3456).
- In `objects.Plot.scale`, `objects.Plot.limit`, and `objects.Plot.label` the `x` / `y` parameters can be used to set a common scale / limit / label for paired subplots (3458).
- Improved the legend display for relational and categorical functions to better represent the user's additional keyword arguments (3467).
- In `ecdfplot`, `stat="percent"` is now a valid option (3336).
- Data values outside the scale transform domain (e.g. non-positive values with a log scale) are now dropped prior to any statistical operations (3488).
- In `histplot`, infinite values are now ignored when choosing the default bin range (3488).
- There is now generalized support for performing statistics in the appropriate space based on axes scales; previously support for this was spotty and at best worked only for log scales (3440).
- Updated `load_dataset` to use an approach more compatible with `pyodide` (3234).
- Support for array-typed palettes is now deprecated. This was not previously documented as supported, but it worked by accident in a few places (3452).
- In `histplot`, treatment of the `binwidth` parameter has changed such that the actual bin width will be only approximately equal to the requested width when that value does not evenly divide the bin range. This fixes an issue where the largest data value was sometimes dropped due to floating point error (3489).
- Fixed `objects.Bar` and `objects.Bars` widths when using a nonlinear scale (3217).
- Worked around an issue in matplotlib that caused incorrect results in `move_legend` when `labels` were provided (3454).
- Fixed a bug introduced in v0.12.0 where `histplot` added a stray empty `BarContainer` (3246).
- Fixed a bug where `objects.Plot.on` would override a figure's layout engine (3216).
- Fixed a bug introduced in v0.12.0 where `lineplot` with a list of tuples for the `dashes` keyword argument raised a `TypeError` (3316).
- Fixed a bug in `PairGrid` that caused an exception when the input dataframe had a column multiindex (3407).
- Improved a few edge cases when using pandas nullable dtypes (3394).
Files
Name | Size | Checksum |
---|---|---|
mwaskom/seaborn-v0.13.0.zip | 2.0 MB | md5:974b67953111fc6f3ac34cdb492f5613 |
Additional details
Related works
- Is supplement to: https://github.com/mwaskom/seaborn/tree/v0.13.0 (URL)