MissingValues Bar

The MissingValues Bar visualizer creates a bar graph that counts the number of missing values per feature column.

If the target y is supplied to fit, the visualizer produces a stacked bar chart instead, with one segment per target class.
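The quantity plotted on each bar can be reproduced with plain NumPy. The following is a minimal sketch, not the visualizer's own implementation; the array X_demo is a hypothetical toy matrix introduced only for illustration.

```python
import numpy as np

# Hypothetical toy matrix: 4 samples, 3 feature columns, NaN marks a missing value.
X_demo = np.array([
    [1.0, np.nan, 3.0],
    [np.nan, np.nan, 1.0],
    [0.5, 2.0, 4.0],
    [np.nan, 1.0, 2.0],
])

# np.isnan yields a boolean mask; summing over axis 0 counts the NaNs in each column.
nan_counts = np.isnan(X_demo).sum(axis=0)
print(nan_counts)
```

Each entry of nan_counts corresponds to the length of one bar in the chart.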

Setup

import numpy as np
from sklearn.datasets import make_classification

X, y = make_classification(
    n_samples=400, n_features=10, n_informative=2, n_redundant=3,
    n_classes=2, n_clusters_per_class=2, random_state=854
)
# replace every value above 1.5 with NaN to simulate missing data
X[X > 1.5] = np.nan
# one label per feature column, used on the chart axis
features = ["Feature {}".format(n) for n in range(10)]

Without Targets Supplied

from yellowbrick.contrib.missing import MissingValuesBar

viz = MissingValuesBar(features=features)
viz.fit(X)
viz.poof()

With Targets (y) Supplied

from yellowbrick.contrib.missing import MissingValuesBar

viz = MissingValuesBar(features=features)
viz.fit(X, y=y) # supply the targets via y
viz.poof()

API Reference

Bar visualizer of missing values by column.

class yellowbrick.contrib.missing.bar.MissingValuesBar(width=0.5, color='black', colors=None, classes=None, **kwargs)

Bases: yellowbrick.contrib.missing.base.MissingDataVisualizer

The MissingValues Bar visualizer creates a bar graph that lists the total count of missing values for each selected feature column.

When y targets are supplied to fit, the output is a stacked bar chart in which each color corresponds to a target class and each segment shows the count of NaNs for that feature among the samples of that class.

Parameters:
alpha : float, default: 0.5

A value for blending elements with the background.

marker : matplotlib marker, default: |

The marker used for each element coordinate in the plot.

color : string, default: black

The color for drawing the bar chart when the y targets are not passed to fit.

colors : list, default: None

The color palette for drawing a stacked bar chart when the y targets are passed to fit.

classes : list, default: None

A list of class names for the legend. If classes is None and a y value is passed to fit, then the classes are selected from the target vector.

kwargs : dict

Keyword arguments that are passed to the base class and may influence the visualization as defined in other Visualizers.

Examples

>>> from yellowbrick.contrib.missing import MissingValuesBar
>>> visualizer = MissingValuesBar()
>>> visualizer.fit(X, y=y)
>>> visualizer.poof()

Attributes:
features_ : np.array

The feature labels ranked according to their importance

classes_ : np.array

The class labels for each of the target values

draw(X, y, **kwargs)

Called from the fit method, this method generates a horizontal bar plot.

If y is None, it draws a simple horizontal bar chart. If y is not None, it draws a stacked horizontal bar chart showing the NaN count for each target value.
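The per-target counts behind the stacked variant can be sketched with NumPy. This is an illustration under stated assumptions rather than the library's code: X_demo, y_demo, and counts_by_class are hypothetical names chosen for the example.

```python
import numpy as np

# Hypothetical toy data: 4 samples, 3 feature columns, binary target.
X_demo = np.array([
    [1.0, np.nan, 3.0],
    [np.nan, np.nan, 1.0],
    [0.5, 2.0, 4.0],
    [np.nan, 1.0, 2.0],
])
y_demo = np.array([0, 1, 0, 1])

# Boolean mask of missing entries, same shape as X_demo.
nan_mask = np.isnan(X_demo)

# For each target value, keep only that class's rows and count NaNs per column.
counts_by_class = {
    int(label): nan_mask[y_demo == label].sum(axis=0)
    for label in np.unique(y_demo)
}

for label, counts in counts_by_class.items():
    print(label, counts)
```

Summing the per-class arrays gives back the overall per-column count that the simple (y is None) chart would show.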

draw_stacked_bar(nan_col_counts)

Draws a horizontal stacked bar chart with different colors for each count of nan values per label.
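A horizontal stacked bar chart of this kind can be drawn directly with matplotlib by offsetting each class's bars by the running total of the classes before it. The sketch below shows that general technique only; it does not reproduce the method's internals, and the feature names and counts are hypothetical values.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical per-class NaN counts for three feature columns.
feature_names = ["Feature 0", "Feature 1", "Feature 2"]
counts_by_class = {
    "class 0": np.array([0, 1, 0]),
    "class 1": np.array([2, 1, 0]),
}

fig, ax = plt.subplots()
positions = np.arange(len(feature_names))
left = np.zeros(len(feature_names))  # running offset where the next segment starts

for label, counts in counts_by_class.items():
    # Each call draws one segment per feature, starting at the current offset.
    ax.barh(positions, counts, left=left, label=label)
    left = left + counts

ax.set_yticks(positions)
ax.set_yticklabels(feature_names)
ax.set_xlabel("Count of missing values")
ax.legend()
```

After the loop, left holds the total NaN count per feature, which is the full length of each stacked bar.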

finalize(**kwargs)

Finalize executes any subclass-specific axes finalization steps. The user calls poof and poof calls finalize.

Parameters:
kwargs : dict

Generic keyword arguments.

get_nan_col_counts(**kwargs)