There is a newer version of the record available.

Published February 22, 2017 | Version 0.3.3
Software Open

Yellowbrick v0.3.3

  • 1. District Data Labs

Description

Yellowbrick is an open source, pure Python project that extends the scikit-learn API with visual analysis and diagnostic tools. The Yellowbrick API also wraps matplotlib to create publication-ready figures and interactive data explorations while still giving developers fine-grained control over figures. For users, Yellowbrick can help evaluate the performance, stability, and predictive value of machine learning models and assist in diagnosing problems throughout the machine learning workflow.

Changes 

Intermediate sprint to demonstrate prototype implementations of text visualizers for NLP models. Primary contributions were the FreqDistVisualizer and the TSNEVisualizer.

The TSNEVisualizer displays a projection of a vectorized corpus in two dimensions using t-SNE (t-distributed stochastic neighbor embedding), a nonlinear dimensionality reduction method that is particularly well suited to embedding high-dimensional data in two or three dimensions for visualization as a scatter plot. t-SNE is widely used in text analysis to show clusters or groups of documents or utterances and their relative proximities.
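The underlying technique can be sketched with scikit-learn alone. This is an illustration of the idea, not the TSNEVisualizer implementation: the toy corpus, the choice of TfidfVectorizer for vectorization, and the perplexity value are all assumptions made for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.manifold import TSNE

# Hypothetical toy corpus: two loose topics (pets and finance).
docs = [
    "the cat sat on the mat",
    "dogs and cats make friendly pets",
    "my dog chased the cat around the garden",
    "pets need food, water and exercise",
    "stock markets fell sharply on monday",
    "investors sold shares as markets dropped",
    "the bank raised interest rates again",
    "bond prices rose while shares fell",
]

# Vectorize the corpus: one row per document, one column per vocabulary term.
vectors = TfidfVectorizer().fit_transform(docs)

# Project to two dimensions. Perplexity must be smaller than the number of
# documents, so a small value is assumed here for the tiny corpus.
tsne = TSNE(n_components=2, perplexity=3, init="random", random_state=0)
coords = tsne.fit_transform(vectors.toarray())

# Each document now has an (x, y) position suitable for a scatter plot.
for doc, (x, y) in zip(docs, coords):
    print(f"{x:9.2f} {y:9.2f}  {doc}")
```

The two columns of `coords` can be passed to a scatter plot, with points colored by document label when labels are available.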

The FreqDistVisualizer implements a frequency distribution plot that shows the frequency of each vocabulary item in the text. In general, it could count any kind of observable event. It is a distribution because it shows how the total number of word tokens in the text is distributed across the vocabulary items.
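As a minimal sketch of what a frequency distribution computes, the counts can be reproduced with the Python standard library. The sample text and whitespace tokenization are assumptions for illustration; this is not the FreqDistVisualizer API.

```python
from collections import Counter

# Hypothetical sample text, tokenized naively on whitespace.
text = "the quick brown fox jumps over the lazy dog the fox"
tokens = text.lower().split()

# Count how often each vocabulary item occurs.
freq = Counter(tokens)

# The counts sum to the total number of tokens, which is why this is a
# distribution of tokens over the vocabulary.
total = sum(freq.values())

# Report the most frequent items with their share of all tokens.
for token, count in freq.most_common(3):
    print(f"{token:>6}  {count}  {count / total:.2f}")
```

Plotting the counts from `freq.most_common(n)` as a bar chart gives the kind of figure described above.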

  • TSNEVisualizer for 2D projections of vectorized documents
  • FreqDistVisualizer for token frequency of text in a corpus
  • Added the user testing evaluation to the documentation
  • Created scikit-yb.org and hosted the documentation there with Read the Docs (RTD)
  • Created a sample corpus and text examples notebook
  • Created a base class for text, TextVisualizer
  • Model selection tutorial using Mushroom Dataset
  • Created a text examples notebook, not yet added to the documentation

Files

yellowbrick-0.3.3.zip (6.1 MB)
md5:dcecb256e5883b3ba44e9d88158cbc4a
