DecisionBoundaries Visualizer

The DecisionBoundariesVisualizer is a bivariate data visualization algorithm that plots the decision boundaries of each class.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.datasets import make_moons, make_classification

# Create dummy data
X, y = make_classification(n_features=2, n_redundant=0, n_informative=2,
                           random_state=1, n_clusters_per_class=1)

rng = np.random.RandomState(2)
X += 2 * rng.uniform(size=X.shape)
linearly_separable = (X, y)

data_set = make_moons(noise=0.3, random_state=0)

X, y = data_set
X = StandardScaler().fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.4, random_state=42)
from sklearn.neighbors import KNeighborsClassifier
from yellowbrick.contrib.classifier import DecisionViz

viz = DecisionViz(KNeighborsClassifier(3), title="Nearest Neighbors", features=['Feature One', 'Feature Two'], classes=['A', 'B'])
viz.fit(X_train, y_train)
viz.draw(X_test, y_test)
viz.poof(outpath="images/knn_decisionviz.png")
[Figure: knn_decisionviz.png]

API Reference

class yellowbrick.contrib.classifier.boundaries.DecisionBoundariesVisualizer(**kwargs)

Bases: yellowbrick.classifier.base.ClassificationScoreVisualizer

DecisionBoundariesVisualizer is a bivariate data visualization algorithm that plots the decision boundaries of each class.

Parameters:
model : the Scikit-Learn estimator

Should be an instance of a classifier; otherwise __init__ will raise an exception.

x : string, default: None

The feature name, corresponding to a column name or index position in the matrix, that will be plotted along the x-axis

y : string, default: None

The feature name, corresponding to a column name or index position in the matrix, that will be plotted along the y-axis

classes : a list of class names for the legend, default: None

If classes is None and a y value is passed to fit then the classes are selected from the target vector.

features : list of strings, default: None

The names of the features or columns

show_scatter : boolean, default: True

If True, a scatter plot of the data points is drawn on top of the decision boundary graph

step_size : float percentage, default: 0.0025

Determines the step size for creating the numpy meshgrid that later becomes the foundation of the decision boundary graph. The default value of 0.0025 means that the step size for constructing the meshgrid will be 0.25% of the difference between the max and min of each feature (x and y).

markers : iterable of strings, default: ,od*vh+

Matplotlib-style markers for the scatter plot points

pcolormesh_alpha : float, default: 0.8

Sets the alpha transparency for the meshgrid of model boundaries

scatter_alpha : float, default: 1.0

Sets the alpha transparency for the scatter plot points

title : string, default: stringified feature_one and feature_two

Sets the title of the visualization

kwargs : keyword arguments passed to the super class.
These parameters can be influenced later on in the visualization
process, but can and should be set as early as possible.
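
The step_size description above can be illustrated with a short sketch. This is not the library's implementation: the helper name make_grid is hypothetical, and it assumes the step along each axis is step_size times that feature's (max - min) range. It uses np.linspace rather than np.arange so that the number of grid points is deterministic.

```python
import numpy as np


def make_grid(X, step_size=0.0025):
    """Build a meshgrid spanning the two feature columns of X.

    Assumption: the spacing along each axis is step_size times that
    feature's (max - min) range. make_grid is a hypothetical helper,
    not part of Yellowbrick.
    """
    x_min, x_max = X[:, 0].min(), X[:, 0].max()
    y_min, y_max = X[:, 1].min(), X[:, 1].max()

    # One point per step, plus one so both endpoints are included
    n_points = int(round(1.0 / step_size)) + 1

    xx, yy = np.meshgrid(
        np.linspace(x_min, x_max, n_points),
        np.linspace(y_min, y_max, n_points),
    )
    return xx, yy


# Small demo with a coarse step so the grid stays tiny
rng = np.random.RandomState(0)
X_demo = rng.uniform(size=(50, 2))
xx, yy = make_grid(X_demo, step_size=0.05)
print(xx.shape)
```

With the default of 0.0025 the same helper would produce a 401 x 401 grid, so smaller values give smoother boundaries at the cost of more predictions.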

draw(X, y=None, **kwargs)

Called from the fit method, this method creates a decision boundary plot. If show_scatter is True, it also overlays a scatter plot in which each instance is drawn as a point colored by its class or target and positioned by its feature values.
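
As a rough illustration of what such a plot involves, the sketch below predicts a class for every point of a grid, renders the result with pcolormesh, and overlays the instances as a scatter plot. It is written directly against scikit-learn and Matplotlib rather than the visualizer, so the grid resolution, the marker choice, and the variable names are assumptions; only the alpha values mirror the documented defaults.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import make_moons
from sklearn.neighbors import KNeighborsClassifier

# Two-feature toy data and a fitted classifier
X, y = make_moons(noise=0.3, random_state=0)
model = KNeighborsClassifier(3).fit(X, y)

# Grid covering the range of both features (200 points per axis is arbitrary)
xx, yy = np.meshgrid(
    np.linspace(X[:, 0].min(), X[:, 0].max(), 200),
    np.linspace(X[:, 1].min(), X[:, 1].max(), 200),
)

# Predict a class for every grid point, then reshape back to the grid
Z = model.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

fig, ax = plt.subplots()
# Colored regions show which class the model predicts at each location
ax.pcolormesh(xx, yy, Z, alpha=0.8, shading="auto")

# Overlay the instances, one marker style per class
for label, marker in zip(np.unique(y), "od"):
    mask = y == label
    ax.scatter(X[mask, 0], X[mask, 1], marker=marker, alpha=1.0, label=str(label))
ax.legend()
```

Calling fig.savefig with a file name would write the figure to disk.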

finalize(**kwargs)

Finalize executes any subclass-specific axes finalization steps. The user calls poof and poof calls finalize.

Parameters:
kwargs: generic keyword arguments.

fit(X, y=None, **kwargs)

The fit method is the primary drawing input for the decision boundaries visualization, since it receives both the X and y data required for the visualization, whereas the transform method does not.

Parameters:
X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features

y : ndarray or Series of length n

An array or series of target or class values

kwargs : dict

Pass generic arguments to the drawing method

Returns:
self : instance

Returns the instance of the visualizer

fit_draw(X, y=None, **kwargs)

Fits a transformer to X and y, then returns a visualization of the features or fitted model.

fit_draw_poof(X, y=None, **kwargs)

Fits a transformer to X and y, then returns a visualization of the features or fitted model, and finally calls poof to finalize.
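
The relationship between these convenience methods can be sketched with a stand-in class. SketchVisualizer is hypothetical and does no plotting; it only records the order of calls, assuming that fit_draw_poof runs fit_draw and then poof, and that poof calls finalize as described above.

```python
class SketchVisualizer:
    """Hypothetical stand-in that records call order; not the Yellowbrick class."""

    def __init__(self):
        self.calls = []

    def fit(self, X, y=None, **kwargs):
        self.calls.append("fit")
        return self

    def draw(self, X, y=None, **kwargs):
        self.calls.append("draw")
        return self

    def finalize(self, **kwargs):
        self.calls.append("finalize")

    def poof(self, **kwargs):
        # The user calls poof, and poof calls finalize
        self.finalize(**kwargs)
        self.calls.append("poof")

    def fit_draw(self, X, y=None, **kwargs):
        self.fit(X, y, **kwargs)
        return self.draw(X, y, **kwargs)

    def fit_draw_poof(self, X, y=None, **kwargs):
        self.fit_draw(X, y, **kwargs)
        self.poof(**kwargs)


viz = SketchVisualizer()
viz.fit_draw_poof([[0.0, 1.0]], [0])
print(viz.calls)  # ['fit', 'draw', 'finalize', 'poof']
```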