sklearn.dummy.DummyClassifier

class sklearn.dummy.DummyClassifier(strategy='stratified', random_state=None, constant=None)

DummyClassifier is a classifier that makes predictions using simple rules.

This classifier is useful as a simple baseline to compare with other (real) classifiers. Do not use it for real problems.

Read more in the User Guide.

Parameters
strategy : str, default=“stratified”

Strategy to use to generate predictions.

  • “stratified”: generates predictions by respecting the training set’s class distribution.

  • “most_frequent”: always predicts the most frequent label in the training set.

  • “prior”: always predicts the class that maximizes the class prior (like “most_frequent”) and predict_proba returns the class prior.

  • “uniform”: generates predictions uniformly at random.

  • “constant”: always predicts a constant label that is provided by the user. This is useful for metrics that evaluate a non-majority class.

    New in version 0.17: DummyClassifier now supports the “prior” strategy.
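The strategies listed above can be compared side by side. The following is a minimal sketch on hypothetical toy data (the arrays `X` and `y` are made up for illustration; the features are never inspected by the classifier).

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Hypothetical toy data: four samples of class 0, two of class 1.
X = np.zeros((6, 1))
y = np.array([0, 0, 0, 0, 1, 1])

# Deterministic strategies: both always predict the majority class.
most_frequent = DummyClassifier(strategy="most_frequent").fit(X, y)
prior = DummyClassifier(strategy="prior").fit(X, y)
pred_most_frequent = most_frequent.predict(X)
pred_prior = prior.predict(X)

# They differ only in predict_proba: a one-hot row versus the class prior.
proba_most_frequent = most_frequent.predict_proba(X[:1])
proba_prior = prior.predict_proba(X[:1])

# Random strategies: predictions are drawn at random, so a seed is fixed here.
stratified = DummyClassifier(strategy="stratified", random_state=0).fit(X, y)
uniform = DummyClassifier(strategy="uniform", random_state=0).fit(X, y)
pred_stratified = stratified.predict(X)
pred_uniform = uniform.predict(X)
```

With this data, “most_frequent” and “prior” both predict class 0 for every sample, while “stratified” and “uniform” return a seed-dependent mix of the two labels.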

random_state : int, RandomState instance or None, default=None

If int, random_state is the seed used by the random number generator. If RandomState instance, random_state is the random number generator. If None, the random number generator is the RandomState instance used by np.random.
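As an illustration of the integer-seed case, the sketch below (toy data assumed) fits two classifiers with the same seed and checks that their random predictions agree. It only matters for the “stratified” and “uniform” strategies, which draw random numbers.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Hypothetical balanced toy data.
X = np.zeros((20, 1))
y = np.array([0] * 10 + [1] * 10)

# Two independent estimators seeded with the same integer.
a = DummyClassifier(strategy="uniform", random_state=42).fit(X, y).predict(X)
b = DummyClassifier(strategy="uniform", random_state=42).fit(X, y).predict(X)

# The same seed yields the same sequence of random predictions.
same_seed_matches = np.array_equal(a, b)
```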

constant : int or str or array of shape = [n_outputs]

The explicit constant as predicted by the “constant” strategy. This parameter is useful only for the “constant” strategy.
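A short sketch of the “constant” strategy, assuming an imbalanced toy problem in which class 1 is the minority. Always predicting the minority label gives a reference point for metrics such as recall on that class. The constant must be one of the labels seen during fit.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import recall_score

# Hypothetical imbalanced data: eight samples of class 0, two of class 1.
X = np.zeros((10, 1))
y = np.array([0] * 8 + [1] * 2)

# Always predict the minority label 1.
clf = DummyClassifier(strategy="constant", constant=1).fit(X, y)
pred = clf.predict(X)

# Every true minority sample is "found", so recall on class 1 is perfect.
recall_minority = recall_score(y, pred, pos_label=1)
```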

Attributes
classes_ : array or list of array of shape = [n_classes]

Class labels for each output.

n_classes_ : int or list of int

Number of labels for each output.

class_prior_ : array or list of array of shape = [n_classes]

Probability of each class for each output.

n_outputs_ : int

Number of outputs.

sparse_output_ : bool

True if the array returned from predict is to be in sparse CSC format. It is automatically set to True if the input y is passed in sparse format.
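The fitted attributes can be inspected as sketched below on hypothetical data. For a single output the attributes are plain values or arrays; for a two-dimensional `y` they become lists with one entry per output.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

X = np.zeros((4, 1))

# Single-output case with string labels (toy data).
y = np.array(["a", "a", "a", "b"])
clf = DummyClassifier(strategy="prior").fit(X, y)
classes = clf.classes_            # sorted unique labels
n_classes = clf.n_classes_        # number of distinct labels
class_prior = clf.class_prior_    # empirical frequency of each label
n_outputs = clf.n_outputs_        # 1 for a one-dimensional y
sparse_output = clf.sparse_output_  # False because y is dense

# Multi-output case: one column of y per output.
Y = np.array([[0, 1], [0, 1], [0, 2], [1, 2]])
multi = DummyClassifier(strategy="prior").fit(X, Y)
multi_classes = [c.tolist() for c in multi.classes_]
multi_n_classes = list(multi.n_classes_)
```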

Methods

fit(self, X, y[, sample_weight])

Fit the random classifier.

get_params(self[, deep])

Get parameters for this estimator.

predict(self, X)

Perform classification on test vectors X.

predict_log_proba(self, X)

Return log probability estimates for the test vectors X.

predict_proba(self, X)

Return probability estimates for the test vectors X.

score(self, X, y[, sample_weight])

Returns the mean accuracy on the given test data and labels.

set_params(self, **params)

Set the parameters of this estimator.

fit(self, X, y, sample_weight=None)

Fit the random classifier.

Parameters
X : {array-like, object with finite length or shape}

Training data, requires length = n_samples.

y : array-like, shape = [n_samples] or [n_samples, n_outputs]

Target values.

sample_weight : array-like of shape = [n_samples], optional

Sample weights.
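To illustrate the effect of sample_weight, the sketch below (toy data and weights are assumptions) fits the same labels with and without weights and compares the resulting class priors. Up-weighting the single class-1 sample is enough to flip the predicted class.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Hypothetical data: three samples of class 0, one of class 1.
X = np.zeros((4, 1))
y = np.array([0, 0, 0, 1])

# Without weights the prior follows the raw label counts.
unweighted = DummyClassifier(strategy="prior").fit(X, y)

# The lone class-1 sample carries weight 9 out of a total of 12.
weights = np.array([1.0, 1.0, 1.0, 9.0])
weighted = DummyClassifier(strategy="prior").fit(X, y, sample_weight=weights)

prior_unweighted = unweighted.class_prior_
prior_weighted = weighted.class_prior_
pred_weighted = weighted.predict(X[:1])
```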

Returns
self : object

get_params(self, deep=True)

Get parameters for this estimator.

Parameters
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns
params : mapping of string to any

Parameter names mapped to their values.
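A brief sketch of how the returned mapping might be used: reading back the constructor arguments and building a second, unfitted estimator with the same configuration. The parameter values are arbitrary examples.

```python
from sklearn.dummy import DummyClassifier

clf = DummyClassifier(strategy="constant", constant=1, random_state=0)

# The mapping is keyed by constructor argument name.
params = clf.get_params()

# The same mapping can configure a fresh estimator.
copy_of_clf = DummyClassifier(**params)
```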

predict(self, X)

Perform classification on test vectors X.

Parameters
X : {array-like, object with finite length or shape}

Test data, requires length = n_samples.

Returns
y : array, shape = [n_samples] or [n_samples, n_outputs]

Predicted target values for X.
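The two possible output shapes can be seen in a small multi-output sketch (toy arrays assumed). Only the number of rows in the test data matters; each output column is predicted independently from its own training distribution.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Hypothetical training set with two target columns.
X_train = np.zeros((4, 3))
Y_train = np.array([[0, 1], [0, 1], [0, 2], [1, 1]])

clf = DummyClassifier(strategy="most_frequent").fit(X_train, Y_train)

# Five test rows give a (5, 2) prediction array, one column per output.
X_test = np.ones((5, 3))
Y_pred = clf.predict(X_test)
```

Here the most frequent label is 0 in the first column and 1 in the second, so every predicted row is `[0, 1]`.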

predict_log_proba(self, X)

Return log probability estimates for the test vectors X.

Parameters
X : {array-like, object with finite length or shape}

Test data, requires length = n_samples.

Returns
P : array-like or list of array-like of shape = [n_samples, n_classes]

Returns the log probability of the sample for each class in the model, where classes are ordered arithmetically for each output.
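As a sanity check, the log probabilities should exponentiate back to the output of predict_proba. A minimal sketch with the “prior” strategy on toy data, chosen so that no class has zero probability:

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Hypothetical data: three samples of class 0, one of class 1.
X = np.zeros((4, 1))
y = np.array([0, 0, 0, 1])

clf = DummyClassifier(strategy="prior").fit(X, y)
log_proba = clf.predict_log_proba(X[:2])
proba = clf.predict_proba(X[:2])

# exp(log p) recovers the probabilities.
round_trip_ok = np.allclose(np.exp(log_proba), proba)
```

Strategies that assign probability zero to a class, such as “most_frequent”, yield `-inf` entries in the log output.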

predict_proba(self, X)

Return probability estimates for the test vectors X.

Parameters
X : {array-like, object with finite length or shape}

Test data, requires length = n_samples.

Returns
P : array-like or list of array-like of shape = [n_samples, n_classes]

Returns the probability of the sample for each class in the model, where classes are ordered arithmetically, for each output.
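The column order follows the sorted labels stored in classes_. The sketch below uses hypothetical string labels to show how a probability row can be paired with its class names.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Hypothetical labels: three "ham", one "spam".
X = np.zeros((4, 1))
y = np.array(["spam", "ham", "ham", "ham"])

clf = DummyClassifier(strategy="prior").fit(X, y)
proba = clf.predict_proba(X[:2])

# Pair each column with its label; classes_ is sorted, so "ham" comes first.
proba_by_class = dict(zip(clf.classes_, proba[0]))
```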

score(self, X, y, sample_weight=None)

Returns the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.

Parameters
X : {array-like, None}

Test samples with shape = (n_samples, n_features) or None. Passing None as test samples gives the same result as passing real test samples, since DummyClassifier operates independently of the sampled observations.

y : array-like, shape = (n_samples) or (n_samples, n_outputs)

True labels for X.

sample_weight : array-like, shape = [n_samples], optional

Sample weights.

Returns
score : float

Mean accuracy of self.predict(X) with respect to y.
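A minimal sketch of the baseline use case, on hypothetical data with a 70/30 class split. It also shows that passing None for X gives the same score as passing real samples, since the features are ignored.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Hypothetical data: seven samples of class 0, three of class 1.
X = np.zeros((10, 1))
y = np.array([0] * 7 + [1] * 3)

baseline = DummyClassifier(strategy="most_frequent").fit(X, y)

# Always predicting the majority class is right 7 times out of 10.
acc_with_X = baseline.score(X, y)

# X may be None because the predictions do not depend on the features.
acc_without_X = baseline.score(None, y)
```

A real classifier evaluated on the same data would need to exceed this accuracy of 0.7 to show that it has learned anything beyond the class distribution.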

set_params(self, **params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns
self
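Both forms of set_params can be sketched as follows. The step names “scale” and “clf” and the toy data are assumptions made for the example; the `<component>__<parameter>` form addresses the classifier inside a pipeline.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Simple estimator: parameters are set by name, and self is returned.
clf = DummyClassifier(strategy="prior")
returned = clf.set_params(strategy="uniform", random_state=0)

# Nested object: the step name prefixes the parameter name.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", DummyClassifier(strategy="prior")),
])
pipe.set_params(clf__strategy="constant", clf__constant=1)

# Hypothetical data to confirm that the updated step is the one used.
X = np.arange(8, dtype=float).reshape(4, 2)
y = np.array([0, 0, 0, 1])
pipe_pred = pipe.fit(X, y).predict(X)
```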