`sklearn.cluster`.OPTICS

class sklearn.cluster.OPTICS(min_samples=5, max_eps=inf, metric='minkowski', p=2, metric_params=None, cluster_method='xi', eps=None, xi=0.05, predecessor_correction=True, min_cluster_size=None, algorithm='auto', leaf_size=30, n_jobs=None)

Estimate clustering structure from vector array.

OPTICS (Ordering Points To Identify the Clustering Structure), closely related to DBSCAN, finds core samples of high density and expands clusters from them [R2c55e37003fe-1]. Unlike DBSCAN, it keeps the cluster hierarchy for a variable neighborhood radius. It is better suited for use on large datasets than the current sklearn implementation of DBSCAN.

Clusters are then extracted using a DBSCAN-like method (cluster_method='dbscan') or an automatic technique proposed in [R2c55e37003fe-1] (cluster_method='xi').
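The two extraction methods can be compared on the same data. The following is a minimal sketch, assuming synthetic two-blob data and illustrative values for eps and xi that are not taken from this page; the labels produced depend on the data and on those values.

```python
import numpy as np
from sklearn.cluster import OPTICS

# Two well-separated Gaussian blobs (synthetic data, for illustration only).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.3, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.3, size=(50, 2)),
])

# Xi extraction: cluster boundaries come from steep changes in reachability.
xi_model = OPTICS(min_samples=5, cluster_method="xi", xi=0.05).fit(X)

# DBSCAN-like extraction: a single threshold eps is applied to the ordering.
# eps=1.0 is an assumed value suited to this toy data.
db_model = OPTICS(min_samples=5, cluster_method="dbscan", eps=1.0).fit(X)

# Label -1 marks samples treated as noise.
print("xi labels:", np.unique(xi_model.labels_))
print("dbscan labels:", np.unique(db_model.labels_))
```

Both models compute the same cluster ordering; only the step that turns the ordering into labels differs.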

This implementation deviates from the original OPTICS by first performing k-nearest-neighbor searches on all points to identify core sizes, then computing only the distances to unprocessed points when constructing the cluster order. Note that we do not employ a heap to manage the expansion candidates, so the time complexity is O(n^2).
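The initial neighbor search can be reproduced outside the estimator. This sketch assumes that the core size of a sample is the distance to its min_samples-th nearest neighbor with the sample itself counted as the first neighbor, and that the fitted estimator exposes these values as core_distances_ (an attribute not listed in this section). The data is synthetic.

```python
import numpy as np
from sklearn.cluster import OPTICS
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
min_samples = 5

# k-nearest-neighbor query on all points. Querying the training data itself
# returns each point as its own nearest neighbor at distance 0, so the last
# column is the distance to the min_samples-th neighbor, self included.
nn = NearestNeighbors(n_neighbors=min_samples).fit(X)
distances, _ = nn.kneighbors(X)
core_distances = distances[:, -1]

# Compare with the values stored by OPTICS (max_eps is left at inf,
# so no core distance is replaced by inf).
model = OPTICS(min_samples=min_samples).fit(X)
matches = np.allclose(core_distances, model.core_distances_)
print("core distances match:", matches)
```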

See also

DBSCAN: A similar clustering for a specified neighborhood radius (eps). The sklearn implementation of DBSCAN is optimized for runtime.

References

R2c55e37003fe-1: Ankerst, Mihael, Markus M. Breunig, Hans-Peter Kriegel, and Jörg Sander. “OPTICS: ordering points to identify the clustering structure.” ACM SIGMOD Record 28, no. 2 (1999): 49-60.
R2c55e37003fe-2: Schubert, Erich, and Michael Gertz. “Improving the Cluster Structure Extracted from OPTICS Plots.” Proc. of the Conference “Lernen, Wissen, Daten, Analysen” (LWDA) (2018): 318-329.

Methods

`fit`(self, X[, y])	Perform OPTICS clustering.
`fit_predict`(self, X[, y])	Perform clustering on X and return cluster labels.
`get_params`(self[, deep])	Get parameters for this estimator.
`set_params`(self, **params)	Set the parameters of this estimator.

__init__(self, min_samples=5, max_eps=inf, metric='minkowski', p=2, metric_params=None, cluster_method='xi', eps=None, xi=0.05, predecessor_correction=True, min_cluster_size=None, algorithm='auto', leaf_size=30, n_jobs=None): Initialize self. See help(type(self)) for accurate signature.

fit(self, X, y=None)

Perform OPTICS clustering.

Extracts an ordered list of points and their reachability distances, and performs initial clustering using the max_eps distance specified when the OPTICS object was instantiated.

Parameters

X : array, shape (n_samples, n_features), or (n_samples, n_samples) if metric='precomputed'
    A feature array, or array of distances between samples if metric='precomputed'.
y : ignored
    Not used, present for API consistency by convention.

Returns

self : instance of OPTICS
    The instance.
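A short sketch of calling fit and reading back the ordered list of points and reachability distances. It assumes the fitted attributes ordering_ and reachability_, which are not documented in this section, and uses synthetic data.

```python
import numpy as np
from sklearn.cluster import OPTICS

rng = np.random.default_rng(42)
X = rng.normal(size=(80, 2))

optics = OPTICS(min_samples=5)
model = optics.fit(X)  # fit returns the estimator itself

# ordering_ holds sample indices in cluster order;
# reachability_ is indexed by sample, so reindex it to follow that order.
order = model.ordering_
reach = model.reachability_[order]

# The first processed point has no predecessor, so its reachability is inf.
print("first point in order:", order[0], "reachability:", reach[0])
```

Plotting reach against its position in the ordering gives the reachability plot from which clusters are extracted.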

fit_predict(self, X, y=None)

Performs clustering on X and returns cluster labels.

Parameters

X : ndarray, shape (n_samples, n_features)
    Input data.
y : Ignored
    Not used, present for API consistency by convention.

Returns

labels : ndarray, shape (n_samples,)
    Cluster labels.
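As a minimal sketch on synthetic data, fit_predict can be used when only the labels are needed. The comparison with labels_ assumes that attribute is set by fit, which this section does not state explicitly.

```python
import numpy as np
from sklearn.cluster import OPTICS

rng = np.random.default_rng(1)
X = rng.normal(size=(60, 2))

# One call: fit the model and return one label per sample (-1 marks noise).
labels = OPTICS(min_samples=5).fit_predict(X)

# The same labels are available on a fitted estimator.
fitted = OPTICS(min_samples=5).fit(X)
same = np.array_equal(labels, fitted.labels_)
print(labels.shape, same)
```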

get_params(self, deep=True)

Get parameters for this estimator.

Parameters

deep : boolean, optional
    If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params : mapping of string to any
    Parameter names mapped to their values.
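For illustration, the returned mapping contains both the values passed to the constructor and the defaults from the class signature. The chosen values are arbitrary.

```python
from sklearn.cluster import OPTICS

model = OPTICS(min_samples=10, xi=0.1)

# Keys are the constructor argument names.
params = model.get_params()
print(params["min_samples"], params["xi"], params["cluster_method"])
```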

set_params(self, **params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns

self
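A minimal sketch of both forms, setting parameters directly and through a nested object. The pipeline and its step names "scale" and "optics" are hypothetical, chosen only to show the <component>__<parameter> syntax.

```python
from sklearn.cluster import OPTICS
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Simple estimator: set_params updates the parameters and returns the estimator.
model = OPTICS()
returned = model.set_params(min_samples=10, cluster_method="dbscan", eps=0.5)

# Nested object: address a step's parameter as <step name>__<parameter>.
pipe = Pipeline([("scale", StandardScaler()), ("optics", OPTICS())])
pipe.set_params(optics__min_samples=20)

print(model.min_samples, pipe.named_steps["optics"].min_samples)
```

Because set_params returns the estimator, it can be chained with fit or used by tools that set parameters by name.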
