This dataset contains the hourly and daily count of rental bikes between years 2011 and 2012 in Capital bikeshare system with the corresponding weather and seasonal information.

Samples total





real, positive


ints, 1-977




Bike sharing systems are new generation of traditional bike rentals where whole process from membership, rental and return back has become automatic. Through these systems, user is able to easily rent a bike from a particular position and return back at another position. Currently, there are about over 500 bike-sharing programs around the world which is composed of over 500 thousands bicycles. Today, there exists great interest in these systems due to their important role in traffic, environmental and health issues.

Apart from interesting real world applications of bike sharing systems, the characteristics of data being generated by these systems make them attractive for the research. Opposed to other transport services such as bus or subway, the duration of travel, departure and arrival position is explicitly recorded in these systems. This feature turns bike sharing system into a virtual sensor network that can be used for sensing mobility in the city. Hence, it is expected that most of important events in the city could be detected via monitoring these data.


Downloaded from the UCI Machine Learning Repository on May 4, 2017.

Fanaee-T, Hadi, and Gama, Joao, ‘Event labeling combining ensemble detectors and background knowledge’, Progress in Artificial Intelligence (2013): pp. 1-15, Springer Berlin Heidelberg


yellowbrick.datasets.loaders.load_bikeshare(data_home=None, return_dataset=False)[source]

Loads the bike sharing univariate dataset that is well suited to regression tasks. The dataset contains 17379 instances with 12 integer and real valued attributes and a continuous target.

The Yellowbrick datasets are hosted online and when requested, the dataset is downloaded to your local computer for use. Note that if the dataset hasn’t been downloaded before, an Internet connection is required. However, if the data is cached locally, no data will be downloaded. Yellowbrick checks the known signature of the dataset with the data downloaded to ensure the download completes successfully.

Datasets are stored alongside the code, but the location can be specified with the data_home parameter or the $YELLOWBRICK_DATA envvar.

data_homestr, optional

The path on disk where data is stored. If not passed in, it is looked up from $YELLOWBRICK_DATA or the default returned by get_data_home.

return_datasetbool, default=False

Return the raw dataset object instead of X and y numpy arrays to get access to alternative targets, extra features, content and meta.

Xarray-like with shape (n_instances, n_features) if return_dataset=False

A pandas DataFrame or numpy array describing the instance features.

yarray-like with shape (n_instances,) if return_dataset=False

A pandas Series or numpy array describing the target vector.

datasetDataset instance if return_dataset=True

The Yellowbrick Dataset object provides an interface to accessing the data in a variety of formats as well as associated metadata and content.