yellowbrick.classifier package

Submodules

yellowbrick.classifier.base module

class yellowbrick.classifier.base.ClassificationScoreVisualizer(model, ax=None, **kwargs)[source]

Bases: yellowbrick.base.ScoreVisualizer

class_counts(y)[source]

yellowbrick.classifier.class_balance module

class yellowbrick.classifier.class_balance.ClassBalance(model, ax=None, classes=None, **kwargs)[source]

Bases: yellowbrick.classifier.base.ClassificationScoreVisualizer

Class balance chart that shows the support for each class in the fitted classification model displayed as a bar plot. It is initialized with a fitted model and generates a class balance chart on draw.

Parameters:

ax: axes

the axis to plot the figure on.

model: estimator

Scikit-Learn estimator object. Should be an instance of a classifier, else __init__() will raise an exception.

classes: list

A list of class names for the legend. If classes is None and a y value is passed to fit then the classes are selected from the target vector.

kwargs: dict

Keyword arguments passed to the super class. Here, used to colorize the bars in the histogram.

These parameters can be influenced later on in the visualization process, but can and should be set as early as possible.

draw()[source]

Renders the class balance chart across the axis.

Returns:

ax : the axis with the plotted figure

finalize(**kwargs)[source]

Finalize executes any subclass-specific axes finalization steps. The user calls poof and poof calls finalize.

Parameters:

kwargs: generic keyword arguments.

fit(X, y=None, **kwargs)[source]

Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features

y : ndarray or Series of length n

An array or series of target or class values

kwargs: keyword arguments passed to Scikit-Learn API.

Returns:

self : instance

Returns the instance of the classification score visualizer

score(X, y=None, **kwargs)[source]

Computes the Scikit-Learn precision_recall_fscore_support metrics, whose support values supply the per-class counts shown in the chart.

Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features

y : ndarray or Series of length n

An array or series of target or class values

Returns:

ax : the axis with the plotted figure

yellowbrick.classifier.class_balance.class_balance(model, X, y=None, ax=None, classes=None, **kwargs)[source]

Quick method:

Displays the support for each class in the fitted classification model as a bar plot.

This helper function is a quick wrapper to utilize the ClassBalance ScoreVisualizer for one-off analysis.

Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features.

y : ndarray or Series of length n

An array or series of target or class values.

ax : matplotlib axes

The axes to plot the figure on.

model : the Scikit-Learn estimator (should be a classifier)

classes : list of strings

The names of the classes in the target

Returns:

ax : matplotlib axes

Returns the axes that the class balance plot was drawn on.
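The "support" plotted by the class balance chart is the number of instances of each class in the target vector. The following is a minimal plain-Python sketch of that count, not Yellowbrick's implementation; the helper name class_support and the sample labels are hypothetical and exist only for illustration.

```python
from collections import Counter


def class_support(y, classes=None):
    """Count how many instances of each class appear in the target vector y."""
    counts = Counter(y)
    # If no class names are given, fall back to the sorted labels found in y.
    labels = sorted(counts) if classes is None else list(classes)
    # Classes that were requested but never occur get a support of 0.
    return {label: counts.get(label, 0) for label in labels}


# Hypothetical target vector, used only for illustration.
y = ["spam", "ham", "ham", "spam", "ham", "eggs"]
support = class_support(y)

# Text stand-in for the bar plot: one bar per class.
for label, count in support.items():
    print(f"{label:>5} {'#' * count} ({count})")
```

Passing an explicit classes list fixes the order of the bars and makes missing classes visible as zero-height bars, which mirrors the role of the classes parameter described above.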

yellowbrick.classifier.classification_report module

class yellowbrick.classifier.classification_report.ClassificationReport(model, ax=None, classes=None, **kwargs)[source]

Bases: yellowbrick.classifier.base.ClassificationScoreVisualizer

Classification report that shows the precision, recall, and F1 scores for the model. Integrates numerical scores as well as a color-coded heatmap.

Parameters:

ax : The axis to plot the figure on.

model : the Scikit-Learn estimator

Should be an instance of a classifier, else __init__ will raise an exception.

classes : a list of class names for the legend

If classes is None and a y value is passed to fit then the classes are selected from the target vector.

colormap : optional string or matplotlib cmap to colorize the heatmap

A sequential colormap is recommended.

kwargs : keyword arguments passed to the super class.

Examples

>>> from yellowbrick.classifier import ClassificationReport
>>> from sklearn.linear_model import LogisticRegression
>>> viz = ClassificationReport(LogisticRegression())
>>> viz.fit(X_train, y_train)
>>> viz.score(X_test, y_test)
>>> viz.poof()

draw(y, y_pred)[source]

Renders the classification report across each axis.

Parameters:

y : ndarray or Series of length n

An array or series of target or class values

y_pred : ndarray or Series of length n

An array or series of predicted target values

finalize(**kwargs)[source]

Finalize executes any subclass-specific axes finalization steps. The user calls poof and poof calls finalize.

Parameters:

kwargs: generic keyword arguments.

fit(X, y=None, **kwargs)[source]

Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features

y : ndarray or Series of length n

An array or series of target or class values

kwargs: keyword arguments passed to Scikit-Learn API.

score(X, y=None, **kwargs)[source]

Generates the Scikit-Learn classification_report

Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features

y : ndarray or Series of length n

An array or series of target or class values

yellowbrick.classifier.classification_report.classification_report(model, X, y=None, ax=None, classes=None, **kwargs)[source]

Quick method:

Displays precision, recall, and F1 scores for the model. Integrates numerical scores as well as a color-coded heatmap.

This helper function is a quick wrapper to utilize the ClassificationReport ScoreVisualizer for one-off analysis.

Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features.

y : ndarray or Series of length n

An array or series of target or class values.

ax : matplotlib axes

The axes to plot the figure on.

model : the Scikit-Learn estimator (should be a classifier)

classes : list of strings

The names of the classes in the target

Returns:

ax : matplotlib axes

Returns the axes that the classification report was drawn on.
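The three scores in the report can be derived per class from true positives, false positives, and false negatives. The sketch below works through that arithmetic in plain Python; it is an illustration under assumed sample data, not the code Yellowbrick or Scikit-Learn runs, and the function name per_class_scores is hypothetical.

```python
def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} computed from paired labels."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Guard against division by zero when a class is never predicted
        # or never present in the true labels.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Hypothetical labels, used only for illustration.
y_true = [0, 0, 0, 1, 1, 1, 1, 2]
y_pred = [0, 0, 1, 1, 1, 1, 0, 2]
scores = per_class_scores(y_true, y_pred)

print("class  precision  recall  f1")
for label, (p, r, f) in scores.items():
    print(f"{label!s:>5}  {p:9.2f}  {r:6.2f}  {f:.2f}")
```

Each row of this table corresponds to one row of the heatmap, with the cell color driven by the score value.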

yellowbrick.classifier.confusion_matrix module

class yellowbrick.classifier.confusion_matrix.ConfusionMatrix(model, ax=None, classes=None, **kwargs)[source]

Bases: yellowbrick.classifier.base.ClassificationScoreVisualizer

Creates a heatmap visualization of the sklearn.metrics.confusion_matrix(). A confusion matrix shows each combination of the true and predicted classes for a test data set.

The default color map uses a yellow/orange/red color scale. The user can choose between displaying values as the percent of true (cell value divided by sum of row) or as direct counts. If percent of true mode is selected, 100% accurate predictions are highlighted in green.

Requires a classification model

Parameters:

model : the Scikit-Learn estimator

Should be an instance of a classifier, else __init__ will raise an exception.

ax : the matplotlib axis to plot the figure on (if None, a new axis will be created)

classes : a list of class names to use in the confusion_matrix.

This is passed to the ‘labels’ parameter of sklearn.metrics.confusion_matrix(), and follows the behaviour indicated by that function. It may be used to reorder or select a subset of labels. If None, values that appear at least once in y_true or y_pred are used in sorted order. Default: None

Examples

>>> from yellowbrick.classifier import ConfusionMatrix
>>> from sklearn.linear_model import LogisticRegression
>>> viz = ConfusionMatrix(LogisticRegression())
>>> viz.fit(X_train, y_train)
>>> viz.score(X_test, y_test)
>>> viz.poof()

classes

Returns a numpy array of the classes in y. Matches the user-provided list if one was passed to __init__. If no list was provided, tries to obtain it from the fitted estimator.

draw(percent=True)[source]

Renders the confusion matrix heatmap. Should only be called internally, since it uses values calculated in score, and score calls this method.

Parameters:

percent: Boolean

Whether the heatmap should represent “% of True” or raw counts

finalize(**kwargs)[source]

fit(X, y=None, **kwargs)[source]

Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features

y : ndarray or Series of length n

An array or series of target or class values

kwargs: keyword arguments passed to Scikit-Learn API.

score(X, y, sample_weight=None, percent=True)[source]

Generates the Scikit-Learn confusion_matrix and applies this to the appropriate axis

Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features

y : ndarray or Series of length n

An array or series of target or class values

sample_weight: optional, passed to the confusion_matrix

percent: optional, Boolean

Determines whether the confusion_matrix is displayed as raw counts or as a percent of true (each cell divided by the sum of its row). Note: if using a subset of classes in __init__, percent should be set to False, or inaccurate percents will be displayed.
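The "percent of true" mode and the warning about class subsets can be illustrated with a small plain-Python sketch. It assumes, as sklearn.metrics.confusion_matrix() does when given a subset of labels, that any pair involving a label outside the subset is dropped. The helper names confusion_counts and percent_of_true and the sample data are hypothetical; this is not Yellowbrick's implementation.

```python
def confusion_counts(y_true, y_pred, labels):
    """Build a count matrix: rows are true classes, columns are predictions."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        # Assumption: pairs that involve a label outside `labels` are skipped.
        if t in index and p in index:
            matrix[index[t]][index[p]] += 1
    return matrix


def percent_of_true(matrix):
    """Divide each cell by its row sum; all-zero rows stay at 0.0."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([100.0 * c / total if total else 0.0 for c in row])
    return result


# Hypothetical labels, used only for illustration.
y_true = ["a", "a", "a", "a", "b", "b", "c", "c"]
y_pred = ["a", "a", "b", "c", "b", "b", "c", "a"]

full = confusion_counts(y_true, y_pred, labels=["a", "b", "c"])
subset = confusion_counts(y_true, y_pred, labels=["a", "b"])

print(percent_of_true(full)[0])    # row for true class "a", all classes
print(percent_of_true(subset)[0])  # same row with class "c" left out
```

With all three classes, half of the true "a" instances are predicted correctly. With "c" left out, the "a" instance that was predicted as "c" disappears from the row sum, so the same cell is reported as roughly two thirds. This is the kind of distortion the note on percent warns about.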

yellowbrick.classifier.rocauc module

class yellowbrick.classifier.rocauc.ROCAUC(model, ax=None, **kwargs)[source]

Bases: yellowbrick.classifier.base.ClassificationScoreVisualizer

Plot the ROC to visualize the tradeoff between the classifier’s sensitivity and specificity.

Parameters:

ax : the axis to plot the figure on.

model : the Scikit-Learn estimator

Should be an instance of a classifier, else __init__ will raise an exception.

roc_color : color of the ROC curve

Specify the color as a matplotlib color: you can specify colors in many weird and wonderful ways, including full names (‘green’), hex strings (‘#008000’), RGB or RGBA tuples ((0,1,0,1)) or grayscale intensities as a string (‘0.8’).

diagonal_color : color of the diagonal

Specify the color as a matplotlib color.

kwargs : keyword arguments passed to the super class.

Hard-coded colors are currently passed in for the Receiver Operating Characteristic curve and the diagonal. These will be refactored to a default Yellowbrick style.

These parameters can be influenced later on in the visualization process, but can and should be set as early as possible.

Examples

>>> from yellowbrick.classifier import ROCAUC
>>> from sklearn.linear_model import LogisticRegression
>>> logistic = LogisticRegression()
>>> viz = ROCAUC(logistic)
>>> viz.fit(X_train, y_train)
>>> viz.score(X_test, y_test)
>>> viz.poof()

draw(y, y_pred)[source]

Renders the ROC-AUC plot. Called internally by score, possibly more than once.

Parameters:

y : ndarray or Series of length n

An array or series of target or class values

y_pred : ndarray or Series of length n

An array or series of predicted target values

Returns:

ax : the axis with the plotted figure

finalize(**kwargs)[source]

Finalize executes any subclass-specific axes finalization steps. The user calls poof and poof calls finalize.

Parameters:

kwargs: generic keyword arguments.

score(X, y=None, **kwargs)[source]

Generates the predicted target values using the Scikit-Learn estimator.

Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features

y : ndarray or Series of length n

An array or series of target or class values

Returns:

ax : the axis with the plotted figure

yellowbrick.classifier.rocauc.roc_auc(model, X, y=None, ax=None, **kwargs)[source]

Quick method:

Displays the tradeoff between the classifier’s sensitivity and specificity.

This helper function is a quick wrapper to utilize the ROCAUC ScoreVisualizer for one-off analysis.

Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features.

y : ndarray or Series of length n

An array or series of target or class values.

ax : matplotlib axes

The axes to plot the figure on.

model : the Scikit-Learn estimator (should be a classifier)

Returns:

ax : matplotlib axes

Returns the axes that the roc-auc curve was drawn on.
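The sensitivity and specificity tradeoff can be made concrete by sweeping a decision threshold over the predicted scores and recording the true positive rate (sensitivity) against the false positive rate (1 minus specificity). The following plain-Python sketch does this for a binary problem and integrates the curve with the trapezoidal rule. It is an illustration only, not the routine the visualizer calls; the function names roc_points and auc and the sample scores are hypothetical.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr) points obtained by sweeping the decision threshold.

    Assumes binary labels in {0, 1} with at least one instance of each.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    # Visit instances from the highest score to the lowest.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last instance of a tied score.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical binary labels and classifier scores, for illustration only.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

points = roc_points(y_true, scores)
area = auc(points)
print(points)
print(f"AUC = {area:.2f}")
```

The diagonal from (0, 0) to (1, 1) corresponds to an area of 0.5, which is why the plot draws it as a reference line for a classifier that ranks instances at random.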