Yellowbrick v1.0
Authors/Creators
- Bengfort, Benjamin
- Bilbro, Rebecca
- Danielsen, Nathan
- McIntyre, Kristen
- Gray, Larry
- Roman, Prema
- Naresh Bachwani
- Carl Dawson
- Daniel Navarrete
- Francois Dion
- Halee Mason
- Jeff Hale
- Jiayi Zhang
- Jimmy Shah
- John Healy
- Justin Ormont
- Kevin Arvai
- Michael Garod
- Mike Curry
- Nabanita Dash
- Nicholas A. Brown
- Piyush Gautam
- Pradeep Singh
- Rohit Ganapathy
- Ry Whittington
- Sangarshanan, Sourav Singh
- Thomas J Fan
- Zijie (ZJ) Poh
- Zonghan, Xie
Description
Yellowbrick is an open source, pure Python project that extends the scikit-learn API with visual analysis and diagnostic tools. The Yellowbrick API also wraps matplotlib to create publication-ready figures and interactive data explorations while still allowing developers fine-grained control of figures. For users, Yellowbrick can help evaluate the performance, stability, and predictive value of machine learning models and assist in diagnosing problems throughout the machine learning workflow.
Note: Python 2 Deprecation: This release deprecates Yellowbrick's support for Python 2.7. After careful consideration and following the lead of our primary dependencies (NumPy, scikit-learn, and Matplotlib), we have chosen to move forward with the community and support Python 3.4 and later.
Major Changes:
- New `JointPlot` visualizer that is specifically designed for machine learning. The new visualizer can compare a feature to a target, features to features, and even feature to feature to target using color. The visualizer gives correlation information at a glance and is designed to work on ML datasets.
- New `PosTagVisualizer` that is specifically designed for diagnostics around natural language processing and grammar-based feature extraction for machine learning. This new visualizer shows counts of different parts-of-speech throughout a tagged corpus.
- New datasets module that provides greater support for interacting with Yellowbrick example datasets, including support for Pandas, npz, and text corpora.
- Management repository for Yellowbrick example data, `yellowbrick-datasets`.
- Added support for matplotlib 3.0.1 or greater.
- `UMAPVisualizer` as an alternative manifold to TSNE for corpus visualization that is fast enough to not require preprocessing PCA or SVD decomposition and preserves higher order similarities and distances.
- Added `..plot::` directives to the documentation to automatically build the images along with the docs and keep them as up to date as possible. The directives also include the source code, making it much simpler to recreate examples.
- Added `target_color_type` functionality to determine continuous or discrete color representations based on the type of the target variable.
- Added alpha param for both test and train residual points in `ResidualsPlot`.
- Added `frameon` param to `Manifold`.
- Added frequency sort feature to `PosTagVisualizer`.
- Added elbow detection using the "kneedle" method to the `KElbowVisualizer`.
- Added governance document outlining new Yellowbrick structure.
- Added `CooksDistance` regression visualizer.
- Updated `DataVisualizer` to handle target type identification.
- Extended `DataVisualizer` and updated its subclasses.
- Added `ProjectionVisualizer` base class.
- Restructured `yellowbrick.target`, `yellowbrick.features`, and `yellowbrick.model_selection` API.
- Restructured regressor and classifier API.
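The elbow detection item above names the "kneedle" method. As a rough illustration of the underlying idea only, and not Yellowbrick's implementation, the sketch below normalizes a decreasing score curve and picks the point that falls furthest below the straight line joining its endpoints. The function name `find_elbow` and the sample distortion values are hypothetical, and the full kneedle algorithm includes smoothing and thresholding steps that are omitted here.

```python
def find_elbow(ks, scores):
    """Return the k where a decreasing, convex score curve bends most.

    Simplified take on the kneedle idea: normalize both axes to [0, 1],
    then pick the point furthest below the chord joining the endpoints.
    """
    if len(ks) != len(scores) or len(ks) < 3:
        raise ValueError("need at least three (k, score) pairs")

    k_min, k_max = min(ks), max(ks)
    s_min, s_max = min(scores), max(scores)
    if k_max == k_min or s_max == s_min:
        raise ValueError("curve is flat; no elbow to detect")

    best_k, best_gap = None, float("-inf")
    for k, s in zip(ks, scores):
        x = (k - k_min) / (k_max - k_min)
        y = (s - s_min) / (s_max - s_min)
        # For a decreasing curve the chord runs from (0, 1) to (1, 0),
        # i.e. y = 1 - x; the gap measures how far the curve sits below it.
        gap = (1 - x) - y
        if gap > best_gap:
            best_k, best_gap = k, gap
    return best_k


# Hypothetical distortion scores for k = 1..8 clusters.
ks = [1, 2, 3, 4, 5, 6, 7, 8]
distortions = [1000, 600, 300, 120, 100, 90, 85, 80]
elbow = find_elbow(ks, distortions)  # 4 for this sample curve
```

With these sample values the curve flattens after four clusters, so the sketch reports `4`. Real distortion curves are noisier, which is why the complete method smooths the curve first.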
Minor Changes:
- Updated `Rank2D` to include Kendall-Tau metric.
- Added user specification of ISO F1 values to `PrecisionRecallCurve` and updated the quick method to accept train and test splits.
- Added code review checklist and conventions to the documentation and expanded the contributing docs to include other tricks and tips.
- Added polish to missing value visualizers code, tests, and documentation.
- Improved `RankD` tests for better coverage.
- Added quick method test for `DispersionPlot` visualizer.
- BugFix: fixed resolve colors bug in TSNE and UMAP text visualizers and added regression tests to prevent future errors.
- BugFix: added support for Yellowbrick palettes to return `colormap`.
- BugFix: fixed `PrecisionRecallCurve` visual display problem with multi-class labels.
- BugFix: fixed the `RFECV` step display bug.
- BugFix: fixed error in distortion score calculation.
- Extended `FeatureImportances` documentation and tests for stacked importances and added a warning when stack should be true.
- Improved the documentation readability and structure.
- Refreshed the `README.md` and added testing and documentation READMEs.
- Updated the gallery to generate thumbnail-quality images.
- Updated the example notebooks and created a quickstart notebook.
- Fixed broken links in the documentation.
- Enhanced the `SilhouetteVisualizer` with `legend` and `color` parameters and moved labels to the y-axis.
- Extended `FeatureImportances` docs/tests for stacked importances.
- Documented the `yellowbrick.download` script.
- Added JOSS citation for "Yellowbrick: Visualizing the Scikit-Learn Model Selection Process".
- Added new pull request (PR) template.
- Added `alpha` param to PCA Decomposition Visualizer.
- Updated documentation with affiliations.
- Added a `windows_tol` for the visual unittest suite.
- Added stacked barchart to `PosTagVisualizer`.
- Let users set colors for `FreqDistVisualizer` and other `ax_bar` visualizers.
- Updated `Manifold` to extend `ProjectionVisualizer`.
- Check if an estimator is already fitted before calling `fit` method.
- Ensure `poof` returns `ax`.
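For readers unfamiliar with the Kendall-Tau metric mentioned in the `Rank2D` item above, the following is a minimal pure-Python sketch of the tau-a rank correlation between two features. It is an illustration of the statistic, not the code Yellowbrick uses; the library's own computation may differ, for example in how tied values are handled. The function name `kendall_tau` is hypothetical.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")

    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both features rank this pair the same way
        elif direction < 0:
            discordant += 1  # the features disagree on the ordering
        # ties (direction == 0) count toward neither total

    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Two swapped neighbours out of ten pairs: 8 concordant, 2 discordant.
tau = kendall_tau([1, 2, 3, 4, 5], [1, 3, 2, 5, 4])  # 0.6
```

A value near 1 means the two features order the samples almost identically, a value near -1 means they order them in reverse, and a value near 0 means there is little monotonic relationship.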
Compatibility Notes:
- This version provides support for matplotlib 3.0.1 or greater and drops support for matplotlib versions less than 2.0.
- This version drops support for Python 2.
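Projects that depend on this release may want to fail early on an unsupported interpreter. The sketch below shows one way to express the Python 3.4 minimum stated in the release note; the helper name `python_is_supported` and the constant `MIN_PYTHON` are hypothetical and are not part of Yellowbrick.

```python
import sys

# Minimum interpreter version taken from the release note (Python 3.4 and later).
MIN_PYTHON = (3, 4)


def python_is_supported(version_info=None):
    """Return True if the given (or running) interpreter meets the minimum."""
    info = sys.version_info if version_info is None else version_info
    # Compare only (major, minor); tuples compare element by element.
    return tuple(info[:2]) >= MIN_PYTHON


if not python_is_supported():
    raise RuntimeError("Python 3.4 or later is required")
```

Passing an explicit tuple, such as `python_is_supported((2, 7))`, makes the check easy to exercise without switching interpreters.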
Files
| Name | Size |
|---|---|
| yellowbrick-v1.0.zip (md5:a64f8e2e55f5d6b14941d71becb2b9b3) | 29.7 MB |
Additional details
Related works
- Is documented by: http://www.scikit-yb.org/en/stable/ (URL)
- Is supplemented by: https://github.com/DistrictDataLabs/yellowbrick/releases/tag/v0.6 (URL)