Yellowbrick v0.3.3
Description
Yellowbrick is an open source, pure Python project that extends the scikit-learn API with visual analysis and diagnostic tools. The Yellowbrick API also wraps matplotlib to create publication-ready figures and interactive data explorations while still allowing developers fine-grained control of figures. For users, Yellowbrick can help evaluate the performance, stability, and predictive value of machine learning models and assist in diagnosing problems throughout the machine learning workflow.
Changes
Intermediate sprint to demonstrate prototype implementations of text visualizers for NLP models. Primary contributions were the FreqDistVisualizer and the TSNEVisualizer.
The TSNEVisualizer displays a projection of a vectorized corpus in two dimensions using TSNE, a nonlinear dimensionality reduction method that is particularly well suited to embedding in two or three dimensions for visualization as a scatter plot. TSNE is widely used in text analysis to show clusters or groups of documents or utterances and their relative proximities.
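The projection step described above can be sketched with scikit-learn directly. This is a minimal illustration and not the TSNEVisualizer implementation: the toy corpus, the TF-IDF vectorization, and the `perplexity` value are assumptions chosen for the example, and it assumes scikit-learn and NumPy are installed.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.manifold import TSNE

# Hypothetical toy corpus; any list of document strings would do.
corpus = [
    "the cat sat on the mat",
    "a cat chased the mouse",
    "dogs and cats are pets",
    "the dog barked at the cat",
    "stocks fell as markets closed",
    "the market rallied on earnings",
    "investors sold shares today",
    "bond yields rose this week",
]

# Vectorize the corpus; TF-IDF is an assumption, any document vectors work.
vectors = TfidfVectorizer().fit_transform(corpus).toarray()

# Project to two dimensions. Perplexity must be smaller than the number
# of documents, so a small value is used for this tiny corpus.
tsne = TSNE(n_components=2, perplexity=3, init="random", random_state=0)
points = tsne.fit_transform(vectors)

# Each row of `points` is the (x, y) position of one document,
# ready to be drawn as a scatter plot.
print(points.shape)
```

Each row of `points` can then be passed to a scatter plot, optionally colored by a document label, which is the kind of figure the visualizer produces.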
The FreqDistVisualizer implements a frequency distribution plot that shows the frequency of each vocabulary item in the text. In general, a frequency distribution can count any kind of observable event. It is a distribution because it shows how the total number of word tokens in the text is distributed across the vocabulary items.
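The idea of a frequency distribution can be illustrated with the Python standard library alone. This is a sketch of the underlying counting, not the FreqDistVisualizer itself; the sample text and the simple regular-expression tokenizer are assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical sample text standing in for a corpus.
text = "the quick brown fox jumps over the lazy dog the fox"

# Naive tokenizer (an assumption): lowercase runs of letters and apostrophes.
tokens = re.findall(r"[a-z']+", text.lower())

# Count how often each vocabulary item occurs.
freq = Counter(tokens)

# The counts sum to the total number of tokens, which is why this is
# a distribution of tokens over the vocabulary.
total = sum(freq.values())

# The most frequent items are what a frequency plot would show first.
for word, count in freq.most_common(2):
    print(word, count)
```

Plotting the sorted counts as a bar chart gives the kind of view the visualizer provides for a vectorized corpus.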
- TSNEVisualizer for 2D projections of vectorized documents
- FreqDistVisualizer for token frequency of text in a corpus
- Added the user testing evaluation to the documentation
- Created scikit-yb.org and host documentation there with RFD
- Created a sample corpus and text examples notebook
- Created a base class for text, TextVisualizer
- Model selection tutorial using Mushroom Dataset
- Created a text examples notebook but have not added it to the documentation
Files
Name | Size | Checksum |
---|---|---|
yellowbrick-0.3.3.zip | 6.1 MB | md5:dcecb256e5883b3ba44e9d88158cbc4a |
Additional details
Related works
- Is documented by
- http://www.scikit-yb.org/en/stable/ (URL)
- Is supplemented by
- https://github.com/DistrictDataLabs/yellowbrick/releases/tag/v0.3.3 (URL)