yellowbrick package

Submodules

yellowbrick.anscombe module

Plots Anscombe’s Quartet as an illustration of the importance of visualization.

yellowbrick.anscombe.anscombe()[source]

Creates 2x2 grid plot of the 4 anscombe datasets for illustration.
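As a small numeric illustration of why the plot is useful, the sketch below compares the first two datasets of the quartet using summary statistics only. The values are the published Anscombe data; the variable names are chosen here for illustration and are not part of the yellowbrick API.

```python
from statistics import mean

# Datasets I and II of Anscombe's Quartet share the same x values.
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

mean_x = mean(x)
mean_y1 = mean(y1)
mean_y2 = mean(y2)

# The means agree, yet individual points differ substantially.
max_gap = max(abs(a - b) for a, b in zip(y1, y2))

print("mean of y1:", round(mean_y1, 2))
print("mean of y2:", round(mean_y2, 2))
print("largest pointwise difference:", round(max_gap, 2))
```

The two series are indistinguishable by their means, which is the kind of difference the 2x2 grid makes visible at a glance.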

yellowbrick.base module

Abstract base classes and interface for Yellowbrick.

class yellowbrick.base.ModelVisualizer(model, ax=None, **kwargs)[source]

Bases: yellowbrick.base.Visualizer, yellowbrick.utils.wrapper.Wrapper

The ModelVisualizer class wraps a Scikit-Learn estimator (usually a predictive model like a regressor, classifier, or clusterer) so that all functionality that belongs to the estimator can be accessed from the visualizer. This allows visualizers to act as proxies for model objects, simply drawing on behalf of the wrapped model.

Parameters:

model : Estimator

A Scikit-Learn estimator to wrap functionality for, usually a regressor, classifier, or clusterer.

ax : matplotlib Axes, default: None

The axes to plot the figure on. If None is passed in, the current axes will be used (or generated if required).

kwargs : dict

Keyword arguments that are passed to the base class and may influence the visualization as defined in other Visualizers.

Notes

Model visualizers can wrap either fitted or unfitted models.
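The proxy behavior described above can be sketched in plain Python. This is a minimal illustration of attribute delegation, not the library's implementation: `ToyEstimator` and `ProxyVisualizer` are hypothetical names, and the real class relies on yellowbrick.utils.wrapper.Wrapper, whose internals are not shown in this reference.

```python
class ToyEstimator:
    """Hypothetical stand-in for a Scikit-Learn estimator."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, X, y=None):
        # "Learn" the mean of the targets so predict has something to return.
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class ProxyVisualizer:
    """Hypothetical wrapper that forwards unknown attributes to the estimator."""

    def __init__(self, model):
        self.estimator = model

    def __getattr__(self, name):
        # Only invoked when normal lookup fails, so the wrapper's own
        # attributes take precedence over those of the estimator.
        if name == "estimator":
            raise AttributeError(name)  # avoid recursion before __init__ runs
        return getattr(self.estimator, name)

    def fit(self, X, y=None):
        self.estimator.fit(X, y)
        return self  # return the wrapper, not the wrapped estimator


viz = ProxyVisualizer(ToyEstimator(alpha=0.5))
fitted = viz.fit([[1], [2], [3]], [2.0, 4.0, 6.0])
print(viz.alpha)           # resolved on the wrapped estimator
print(viz.predict([[4]]))  # method resolved on the wrapped estimator
```

Because lookup falls through to the estimator only when the wrapper lacks the attribute, a subclass can override `fit` or add drawing methods without hiding the rest of the estimator's interface.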

fit(X, y=None, **kwargs)[source]

Fits the wrapped estimator so that subclasses that override fit can ensure that the estimator is fit using super rather than a direct call down to the estimator. Score estimators tend to expect a fitted model.

Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features

y : ndarray or Series of length n

An array or series of target or class values

kwargs: dict

Keyword arguments passed to the drawing functionality or to the Scikit-Learn API. See visualizer specific details for how to use the kwargs to modify the visualization or fitting process.

Returns:

self : visualizer

The fit method must always return self to support pipelines.

class yellowbrick.base.MultiModelMixin(models, ax=None, **kwargs)[source]

Bases: object

Does predict for each of the models and generates subplots.

generate_subplots()[source]

Generates the subplots for the number of given models.

predict(X, y)[source]

Returns a generator containing the predictions for each of the internal models (using cross_val_predict with cv=12).

Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features

y : ndarray or Series of length n

An array or series of target or class values

kwargs: dict

keyword arguments passed to Scikit-Learn API.
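The generator behavior can be sketched as follows. This is a simplified illustration that fits and predicts on the same data and omits the cross-validation step entirely; `MeanModel`, `MaxModel`, and `predict_all` are hypothetical names and not part of yellowbrick.

```python
class MeanModel:
    """Hypothetical model that always predicts the mean target."""

    def fit(self, X, y):
        self.value_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.value_ for _ in X]


class MaxModel:
    """Hypothetical model that always predicts the largest target."""

    def fit(self, X, y):
        self.value_ = max(y)
        return self

    def predict(self, X):
        return [self.value_ for _ in X]


def predict_all(models, X, y):
    # Yield one list of predictions per model, lazily and in order.
    # Assumption: no cross-validation here, unlike the documented method.
    for model in models:
        yield model.fit(X, y).predict(X)


X = [[0], [1], [2], [3]]
y = [1.0, 2.0, 3.0, 6.0]
results = list(predict_all([MeanModel(), MaxModel()], X, y))
print(results)
```

Returning a generator means each model is fitted only when its predictions are consumed, which suits drawing one subplot per model in turn.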

class yellowbrick.base.ScoreVisualizer(model, ax=None, **kwargs)[source]

Bases: yellowbrick.base.ModelVisualizer

The ScoreVisualizer reports the performance of a Scikit-Learn estimator (usually a predictive model like a regressor, classifier, or clusterer) in a visual manner. They hook into the Scikit-Learn pipeline through the score(X_test, y_test) method, reporting not just a single numeric score, but also a visual report of the score in model space.

Parameters:

model : Estimator

A Scikit-Learn estimator to wrap functionality for, usually a regressor, classifier, or clusterer.

ax : matplotlib Axes, default: None

The axis to plot the figure on. If None is passed in the current axes will be used (or generated if required).

kwargs : dict

Keyword arguments that are passed to the base class and may influence the visualization as defined in other Visualizers.

Notes

Score visualizers can wrap either fitted or unfitted models.

score(X, y, **kwargs)[source]

The primary entry point for score visualizers is the score method, which makes predictions based on X and scores them relative to y.
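The score flow can be sketched in plain Python under a few assumptions: accuracy is used as the metric, drawing is replaced by recording the data that would be plotted, and `ToyClassifier` and `ToyScoreVisualizer` are hypothetical names rather than yellowbrick classes.

```python
class ToyClassifier:
    """Hypothetical estimator: predicts 1 when the single feature exceeds 0."""

    def fit(self, X, y=None):
        return self

    def predict(self, X):
        return [1 if row[0] > 0 else 0 for row in X]


class ToyScoreVisualizer:
    """Hypothetical score visualizer; not the library's implementation."""

    def __init__(self, model):
        self.estimator = model
        self.drawn = None

    def fit(self, X, y=None):
        self.estimator.fit(X, y)
        return self

    def score(self, X, y):
        # Predict from X, then score the predictions relative to y.
        y_pred = self.estimator.predict(X)
        correct = sum(1 for p, t in zip(y_pred, y) if p == t)
        self.score_ = correct / len(y)
        self.draw(y, y_pred)
        return self.score_

    def draw(self, y_true, y_pred):
        # A real visualizer would plot here; this sketch only records
        # the (true, predicted) pairs that a plot would be built from.
        self.drawn = list(zip(y_true, y_pred))


X_test = [[-1.0], [2.0], [0.5], [-3.0]]
y_test = [0, 1, 0, 0]
viz = ToyScoreVisualizer(ToyClassifier()).fit(X_test, y_test)
accuracy = viz.score(X_test, y_test)
print(accuracy)
```

The method returns the numeric score, as an estimator's score method would, while the visual report is produced as a side effect.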

class yellowbrick.base.Visualizer(ax=None, **kwargs)[source]

Bases: sklearn.base.BaseEstimator

The root of the visual object hierarchy that defines how yellowbrick creates, stores, and renders visual artifacts using matplotlib.

Inherits from Scikit-Learn’s BaseEstimator class.

The base class for feature visualization and model visualization primarily ensures that styling arguments are passed in.

Parameters:

ax : matplotlib Axes, default: None

The axis to plot the figure on. If None is passed in the current axes will be used (or generated if required).

kwargs : dict

Keyword arguments that are passed to the base class and may influence the visualization as defined in other Visualizers. Optional keyword arguments include:

Property    Description
size        specify a size for the figure (currently unimplemented)
color       specify a color, colormap, or palette for the figure
title       specify the title of the figure

Notes

Visualizers are objects that learn from data (e.g. estimators), therefore they must be fit() before they can be drawn or used. Visualizers also maintain a reference to an ax object, a matplotlib Axes where the figures are drawn and rendered.

ax

The matplotlib axes that the visualizer draws upon (can also be a grid of multiple axes objects). The visualizer automatically creates an axes for the user if one has not been specified.

draw(**kwargs)[source]

The fitting or transformation process usually calls draw (not the user). This function is implemented for developers to hook into the matplotlib interface and to create an internal representation of the data the visualizer was trained on in the form of a figure or axes.

Parameters:

kwargs: dict

generic keyword arguments.

finalize(**kwargs)[source]

Finalize executes any subclass-specific axes finalization steps.

Parameters:

kwargs: dict

generic keyword arguments.

Notes

The user calls poof and poof calls finalize. Developers should implement visualizer-specific finalization methods like setting titles or axes labels, etc.

fit(X, y=None, **kwargs)[source]

Fits a visualizer to data and is the primary entry point for producing a visualization. Visualizers are Scikit-Learn Estimator objects, which learn from data in order to produce a visual analysis or diagnostic. They can do this either by fitting feature-related data or by fitting an underlying model (or models) and visualizing the results.

Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features

y : ndarray or Series of length n

An array or series of target or class values

kwargs: dict

Keyword arguments passed to the drawing functionality or to the Scikit-Learn API. See visualizer specific details for how to use the kwargs to modify the visualization or fitting process.

Returns:

self : visualizer

The fit method must always return self to support pipelines.

poof(outpath=None, **kwargs)[source]

Poof makes the magic happen and a visualizer appear! You can pass in a path to save the figure to disk with various backends, or you can call it with no arguments to show the figure either in a notebook or in a GUI window that pops up on screen.

Parameters:

outpath: string, default: None

Path to save the figure to. If None, the figure is shown in a window instead.

kwargs: dict

generic keyword arguments.

Notes

Developers of visualizers don’t usually override poof, as it is primarily called by the user to render the visualization.
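The call order described in the notes for draw, finalize, and poof can be sketched as below. This is a minimal stand-in that records calls instead of touching matplotlib; `ToyVisualizer` and its `calls` list are hypothetical and exist only to make the sequence observable.

```python
class ToyVisualizer:
    """Hypothetical visualizer that logs its lifecycle instead of plotting."""

    def __init__(self):
        self.calls = []

    def fit(self, X, y=None, **kwargs):
        self.calls.append("fit")
        self.draw(**kwargs)  # fitting triggers draw, not the user
        return self          # returning self allows chaining

    def draw(self, **kwargs):
        self.calls.append("draw")

    def finalize(self, **kwargs):
        # Subclasses would set titles, axis labels, and legends here.
        self.calls.append("finalize")

    def poof(self, outpath=None, **kwargs):
        self.finalize(**kwargs)  # poof calls finalize
        if outpath is not None:
            self.calls.append("save:" + outpath)
        else:
            self.calls.append("show")


viz = ToyVisualizer().fit([[0], [1]])
viz.poof()
print(viz.calls)

saved = ToyVisualizer().fit([[0], [1]])
saved.poof(outpath="figure.png")
print(saved.calls)
```

Under this division of labor a subclass typically overrides draw and finalize, while user code only calls fit and poof.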

set_title(title=None)[source]

Sets the title on the current axes.

Parameters:

title: string, default: None

Add title to figure or if None leave untitled.

yellowbrick.bestfit module

Uses Scikit-Learn to compute a best fit function, then draws it in the plot.

yellowbrick.bestfit.draw_best_fit(X, y, ax, estimator='linear', **kwargs)[source]

Uses Scikit-Learn to fit a model to X and y, then uses the resulting model to predict the curve based on the X values. This curve is drawn to the ax (matplotlib axes), which must be passed as the third argument.

The estimator function can be one of the following:

'linear': Uses OLS to fit the regression
'quadratic': Uses OLS with Polynomial order 2
'exponential': Not implemented yet
'log': Not implemented yet
'select_best': Selects the best fit via MSE

The remaining keyword arguments are passed to ax.plot to define and describe the line of best fit.

yellowbrick.bestfit.fit_exponential(X, y)[source]

Fits an exponential curve to the data.

yellowbrick.bestfit.fit_linear(X, y)[source]

Uses OLS to fit the regression.

yellowbrick.bestfit.fit_log(X, y)[source]

Fits a logarithmic curve to the data.

yellowbrick.bestfit.fit_quadratic(X, y)[source]

Uses OLS with Polynomial order 2.

yellowbrick.bestfit.fit_select_best(X, y)[source]

Selects the best fit of the estimators already implemented by choosing the model with the smallest mean square error metric for the trained values.
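Selection by training error can be sketched without Scikit-Learn. The example below fits linear and quadratic polynomials by ordinary least squares (solving the normal equations directly) and keeps the one with the smaller mean squared error on the training values. The helpers `polyfit`, `predict`, `mse`, and `select_best` are hypothetical names written for this sketch and do not reflect how the library itself performs the fit.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients, lowest order first."""
    n = degree + 1
    # Normal equations: A[i][j] = sum(x**(i+j)), b[i] = sum(y * x**i)
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= factor * A[col][k]
            b[row] -= factor * b[col]
    # Back substitution
    coefs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(A[i][j] * coefs[j] for j in range(i + 1, n))
        coefs[i] = (b[i] - tail) / A[i][i]
    return coefs


def predict(coefs, xs):
    return [sum(c * x ** k for k, c in enumerate(coefs)) for x in xs]


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


def select_best(xs, ys):
    """Return the candidate name with the smallest training MSE."""
    candidates = {"linear": 1, "quadratic": 2}
    errors = {}
    for name, degree in candidates.items():
        errors[name] = mse(ys, predict(polyfit(xs, ys, degree), xs))
    return min(errors, key=errors.get), errors


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 2.0, 5.0, 10.0, 17.0]  # exactly x**2 + 1
best, errors = select_best(xs, ys)
print(best, errors)
```

Because the quadratic model contains the linear one as a special case, its training error can never be larger, so selecting on trained values tends to favor the more flexible candidate.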

yellowbrick.exceptions module

Exceptions hierarchy for the yellowbrick library

exception yellowbrick.exceptions.ModelError[source]

Bases: yellowbrick.exceptions.YellowbrickError

A problem when interacting with sklearn or the ML framework.

exception yellowbrick.exceptions.VisualError[source]

Bases: yellowbrick.exceptions.YellowbrickError

A problem when interacting with matplotlib or the display framework.

exception yellowbrick.exceptions.YellowbrickError[source]

Bases: exceptions.Exception

The root exception for all yellowbrick related errors.

exception yellowbrick.exceptions.YellowbrickTypeError[source]

Bases: yellowbrick.exceptions.YellowbrickError, exceptions.TypeError

There was an unexpected type or none for a property or input.

exception yellowbrick.exceptions.YellowbrickValueError[source]

Bases: yellowbrick.exceptions.YellowbrickError, exceptions.ValueError

A bad value was passed into a function.
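The practical effect of the multiple inheritance listed above is that callers can catch either the library root or the matching built-in exception. The sketch below re-declares the hierarchy locally so that it runs without yellowbrick installed; it uses the Python 3 spelling of the built-in bases, and `check_estimator_name` is a hypothetical helper.

```python
# Local re-declaration of the documented hierarchy, for illustration only.
class YellowbrickError(Exception):
    """The root exception for all yellowbrick related errors."""


class VisualError(YellowbrickError):
    """A problem when interacting with matplotlib or the display framework."""


class ModelError(YellowbrickError):
    """A problem when interacting with sklearn or the ML framework."""


class YellowbrickTypeError(YellowbrickError, TypeError):
    """There was an unexpected type or none for a property or input."""


class YellowbrickValueError(YellowbrickError, ValueError):
    """A bad value was passed into a function."""


def check_estimator_name(name):
    # Hypothetical validation helper, not part of the library.
    allowed = ("linear", "quadratic", "select_best")
    if name not in allowed:
        raise YellowbrickValueError("'{}' is not a valid estimator".format(name))
    return name


caught = []
for handler in (ValueError, YellowbrickError):
    try:
        check_estimator_name("cubic")
    except handler as error:
        caught.append(handler.__name__)

print(caught)
```

Code that already handles ValueError or TypeError keeps working, while library-aware code can catch YellowbrickError to handle every yellowbrick failure in one place.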

yellowbrick.pipeline module

Implements a visual pipeline that subclasses Scikit-Learn pipelines.

class yellowbrick.pipeline.VisualPipeline(steps)[source]

Bases: sklearn.pipeline.Pipeline

Pipeline of transforms and visualizers with a final estimator.

Sequentially apply a list of transforms, visualizers, and a final estimator which may be evaluated by additional visualizers. Intermediate steps of the pipeline must be ‘transforms’, that is, they must implement fit and transform methods. The final estimator only needs to implement fit.

Any step that implements draw or poof methods can be called sequentially directly from the VisualPipeline, allowing multiple visual diagnostics to be generated, displayed, and saved on demand. If draw or poof is not called, the visual pipeline should be equivalent to the simple pipeline to ensure no reduction in performance.

The purpose of the pipeline is to assemble several steps that can be cross-validated together while setting different parameters. These steps can be visually diagnosed by visualizers at every point in the pipeline.

Parameters:

steps : list

List of (name, transform) tuples (implementing fit/transform) that are chained, in the order in which they are chained, with the last object an estimator. Any intermediate step can be a FeatureVisualizer and the last step can be a ScoreVisualizer.

Attributes

named_steps : dict

Read-only attribute to access any step parameter by user given name. Keys are step names and values are step parameters.

visual_steps : dict

Read-only attribute to access any visualizer in the pipeline by user given name. Keys are step names and values are visualizer steps.

fit_transform_poof(X, y=None, outpath=None, **kwargs)[source]

Fit the model and transforms and then call poof.

poof(outdir=None, ext='.pdf', **kwargs)[source]

A single entry point for rendering all visualizations in the visual pipeline. The rendering for the output depends on the backend context, but for path-based renderings (e.g. saving to a file), specify a directory and extension to compose an outpath to save each visualization (file names will be based on the named step).

Parameters:

outdir : path

The directory to save visualizations to.

ext : string, default: ".pdf"

The extension of the file to save the visualization to.

kwargs : dict

Keyword arguments to pass to the poof() method of all steps.
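How an outpath might be composed from the directory, the step name, and the extension can be sketched as follows. `compose_outpaths` is a hypothetical helper written for this example; the exact naming scheme used by the library is an assumption based on the description above.

```python
import os


def compose_outpaths(step_names, outdir, ext=".pdf"):
    """Map each named step to a file path of the form outdir/name + ext."""
    return {name: os.path.join(outdir, name + ext) for name in step_names}


# Step names as they would appear in the (name, transform) tuples.
paths = compose_outpaths(["scaler", "rank2d", "residuals"], "figures")
for name, path in paths.items():
    print(name, "->", path)

png_paths = compose_outpaths(["rank2d"], "figures", ext=".png")
```

Keying file names on the step name keeps outputs distinct as long as step names are unique, which a pipeline already requires.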

visual_steps

yellowbrick.regressor module

Visualizers for Regression analysis and diagnostics, particularly visualizations related to evaluating Scikit-Learn regressor models.

yellowbrick.utils module

Utility functions and helpers for the Yellowbrick library.

yellowbrick.version module

Maintains version and package information for deployment.

yellowbrick.version.get_version(short=False)[source]

Prints the version.

Module contents

A suite of visual analysis and diagnostic tools to facilitate feature selection, model selection, and parameter tuning for machine learning.