sklearn.random_projection.GaussianRandomProjection
-
class sklearn.random_projection.GaussianRandomProjection(n_components='auto', eps=0.1, random_state=None)
Reduce dimensionality through Gaussian random projection.
The components of the random matrix are drawn from N(0, 1 / n_components).
Read more in the User Guide.
- Parameters
- n_components : int or ‘auto’, optional (default=‘auto’)
Dimensionality of the target projection space.
n_components can be automatically adjusted according to the number of samples in the dataset and the bound given by the Johnson-Lindenstrauss lemma. In that case the quality of the embedding is controlled by the eps parameter.
Note that the Johnson-Lindenstrauss lemma can yield a very conservative estimate of the required number of components, as it makes no assumption about the structure of the dataset.
- eps : strictly positive float, optional (default=0.1)
Parameter to control the quality of the embedding according to the Johnson-Lindenstrauss lemma when n_components is set to ‘auto’.
Smaller values give a better embedding and a higher number of dimensions (n_components) in the target projection space.
- random_state : int, RandomState instance or None, optional (default=None)
Controls the pseudo-random number generator used to generate the matrix at fit time. If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.
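The relationship between eps and the automatically chosen dimensionality can be explored with the helper `johnson_lindenstrauss_min_dim` from the same module. The following is a minimal sketch; the sample count and the eps values are chosen only for illustration.

```python
from sklearn.random_projection import johnson_lindenstrauss_min_dim

# Illustrative sample count; the bound depends on n_samples and eps only,
# not on the number of features or the structure of the data.
n_samples = 100

# Smaller eps asks for less distortion and therefore more components.
for eps in (0.5, 0.1, 0.05):
    k = johnson_lindenstrauss_min_dim(n_samples=n_samples, eps=eps)
    print(f"eps={eps}: at least {k} components")

# The default eps=0.1 with 100 samples gives the dimensionality that
# n_components='auto' would select for such a dataset.
k_default = johnson_lindenstrauss_min_dim(n_samples=n_samples, eps=0.1)
k_tight = johnson_lindenstrauss_min_dim(n_samples=n_samples, eps=0.05)
```

Because the bound ignores the data itself, a much smaller explicit n_components may work well in practice; that is worth checking empirically on the dataset at hand.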
- Attributes
- n_components_ : int
Concrete number of components computed when n_components=‘auto’.
- components_ : numpy array of shape [n_components, n_features]
Random matrix used for the projection.
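As a rough sketch of how these attributes might be inspected after fitting, the snippet below checks the shape of components_ and compares its empirical variance with 1 / n_components. The data shape, n_components=200 and the seeds are arbitrary illustrative choices.

```python
import numpy as np
from sklearn.random_projection import GaussianRandomProjection

# Illustrative data: 20 samples, 2000 features.
X = np.random.RandomState(0).rand(20, 2000)
transformer = GaussianRandomProjection(n_components=200, random_state=0).fit(X)

components = transformer.components_
print(components.shape)  # one row per component, one column per feature

# Entries are drawn from N(0, 1 / n_components), so the empirical variance
# of the matrix should be close to that value.
empirical_var = components.var()
expected_var = 1.0 / transformer.n_components_
print(empirical_var, expected_var)
```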
Examples
>>> import numpy as np
>>> from sklearn.random_projection import GaussianRandomProjection
>>> rng = np.random.RandomState(42)
>>> X = rng.rand(100, 10000)
>>> transformer = GaussianRandomProjection(random_state=rng)
>>> X_new = transformer.fit_transform(X)
>>> X_new.shape
(100, 3947)
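One way to get a feel for the quality of an embedding is to compare pairwise squared distances before and after projection. This is a sketch under assumed settings: the data shape, n_components=1000 and the seeds are illustrative and not part of the original example.

```python
import numpy as np
from sklearn.metrics.pairwise import euclidean_distances
from sklearn.random_projection import GaussianRandomProjection

rng = np.random.RandomState(0)
X = rng.rand(50, 5000)

# Explicit target dimensionality, chosen here only for illustration.
transformer = GaussianRandomProjection(n_components=1000, random_state=0)
X_new = transformer.fit_transform(X)

# Compare squared distances over all distinct pairs of samples.
iu = np.triu_indices(X.shape[0], k=1)
d_orig = euclidean_distances(X, squared=True)[iu]
d_proj = euclidean_distances(X_new, squared=True)[iu]
ratios = d_proj / d_orig

# Ratios near 1 mean the projection roughly preserves pairwise distances.
print(ratios.mean(), ratios.min(), ratios.max())
```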
Methods
fit(self, X[, y]): Generate a random projection matrix.
fit_transform(self, X[, y]): Fit to data, then transform it.
get_params(self[, deep]): Get parameters for this estimator.
set_params(self, **params): Set the parameters of this estimator.
transform(self, X): Project the data by using matrix product with the random matrix.
-
__init__(self, n_components='auto', eps=0.1, random_state=None)
Initialize self. See help(type(self)) for accurate signature.
-
fit(self, X, y=None)
Generate a random projection matrix.
- Parameters
- X : numpy array or scipy.sparse of shape [n_samples, n_features]
Training set: only the shape is used to find optimal random matrix dimensions based on the Johnson-Lindenstrauss bound.
- y
Ignored
- Returns
- self
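Since only the shape of X is used, two datasets of the same shape fitted with the same integer seed should produce the same random matrix. A small sketch of that behaviour follows; the shapes, scaling factor and seed are illustrative assumptions.

```python
import numpy as np
from sklearn.random_projection import GaussianRandomProjection

rng = np.random.RandomState(0)
X_a = rng.rand(10, 500)
X_b = rng.rand(10, 500) * 1000.0  # same shape, very different values

# An integer random_state reseeds the generator at each fit.
proj_a = GaussianRandomProjection(n_components=50, random_state=42).fit(X_a)
proj_b = GaussianRandomProjection(n_components=50, random_state=42).fit(X_b)

# The values in X do not influence the generated matrix.
same = np.array_equal(proj_a.components_, proj_b.components_)
print(same)
```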
-
fit_transform(self, X, y=None, **fit_params)
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters
- X : numpy array of shape [n_samples, n_features]
Training set.
- y : numpy array of shape [n_samples]
Target values.
- Returns
- X_new : numpy array of shape [n_samples, n_features_new]
Transformed array.
-
get_params(self, deep=True)
Get parameters for this estimator.
- Parameters
- deep : boolean, optional
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
- params : mapping of string to any
Parameter names mapped to their values.
-
set_params(self, **params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Returns
- self
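The following sketch illustrates get_params and set_params on a standalone estimator and on a nested one. The pipeline, its step names ("scale", "proj") and the parameter values are hypothetical choices made for this example.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.random_projection import GaussianRandomProjection

# Simple estimator: parameters are addressed by name.
proj = GaussianRandomProjection()
print(proj.get_params())
proj.set_params(eps=0.3)

# Nested object: parameters use the <component>__<parameter> form,
# where the component is the (hypothetical) step name "proj".
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("proj", GaussianRandomProjection()),
])
pipe.set_params(proj__n_components=20, proj__random_state=0)
print(pipe.get_params()["proj__n_components"])
```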
-
transform(self, X)
Project the data by using matrix product with the random matrix.
- Parameters
- X : numpy array or scipy.sparse of shape [n_samples, n_features]
The input data to project into a lower-dimensional space.
- Returns
- X_new : numpy array or scipy.sparse of shape [n_samples, n_components]
Projected array.
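For dense input, the projection can be reproduced by hand as a product with the transpose of components_. This is a minimal sketch with illustrative shapes and seeds, not a description of the internal implementation.

```python
import numpy as np
from sklearn.random_projection import GaussianRandomProjection

X = np.random.RandomState(0).rand(10, 300)
transformer = GaussianRandomProjection(n_components=20, random_state=0).fit(X)

# Projection through the estimator.
X_new = transformer.transform(X)

# The same projection written as an explicit matrix product:
# [n_samples, n_features] @ [n_features, n_components].
manual = X @ transformer.components_.T

print(X_new.shape, np.allclose(X_new, manual))
```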