geosnap.analyze.cluster module

geosnap.analyze.cluster.affinity_propagation(X, damping=0.8, preference=-1000, max_iter=500, convergence_iter=15, copy=True, affinity='euclidean', verbose=False, **kwargs)[source]

Clustering with Affinity Propagation.

Parameters
X: array-like

n x k attribute data

preference: array-like, shape (n_samples,) or float, optional, default: -1000

The preference parameter passed to scikit-learn’s affinity propagation algorithm

damping: float, optional, default: 0.8

The damping parameter passed to scikit-learn’s affinity propagation algorithm

max_iter: int, optional, default: 500

Maximum number of iterations

Returns
model: sklearn AffinityPropagation instance
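
The following is a minimal sketch of the kind of model this function returns, assuming the wrapper forwards its arguments to scikit-learn's `AffinityPropagation` (that forwarding is an assumption, not confirmed here). It calls scikit-learn directly with the defaults from the signature above. The synthetic data and `random_state=0` are illustrative choices.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation

# Illustrative data: two well-separated groups of 20 observations, 2 attributes each.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(20, 2)),
    rng.normal(loc=100.0, scale=1.0, size=(20, 2)),
])

# Same defaults as the signature documented above; random_state is added
# here only to make the sketch repeatable.
model = AffinityPropagation(
    damping=0.8,
    preference=-1000,
    max_iter=500,
    convergence_iter=15,
    affinity="euclidean",
    random_state=0,
).fit(X)

labels = model.labels_                              # one label per row of X
n_exemplars = len(model.cluster_centers_indices_)   # number of exemplars found
```

Because similarities are negative squared distances, a `preference` of -1000 is only meaningful relative to the scale of `X`; lower values tend to produce fewer exemplars.
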
geosnap.analyze.cluster.azp(X, w, n_clusters=5, **kwargs)[source]

AZP clustering algorithm

Parameters
X: array-like

n x k attribute data

w: PySAL W instance

spatial weights matrix

n_clusters: int, optional, default: 5

The number of clusters to form.

Returns
model: region AZP instance
geosnap.analyze.cluster.gaussian_mixture(X, n_clusters=5, covariance_type='full', best_model=False, max_clusters=10, random_state=None, **kwargs)[source]

Clustering with Gaussian Mixture Model

Parameters
X: array-like

n x k attribute data

n_clusters: int, optional, default: 5

The number of clusters to form.

covariance_type: str, optional, default: "full"

The covariance parameter passed to scikit-learn’s GaussianMixture algorithm

best_model: bool, optional, default: False

Option for finding endogenous K according to Bayesian Information Criterion

max_clusters: int, optional, default: 10

The max number of clusters to test if using best_model option

random_state: int, optional, default: None

The seed used to generate replicable results

Returns
model: sklearn GaussianMixture instance
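
The `best_model` option selects the number of components by the Bayesian Information Criterion. The sketch below shows one plausible way to do that with scikit-learn directly: fit one `GaussianMixture` per candidate count up to `max_clusters` and keep the fit with the lowest BIC. It is an illustration under that assumption, not the function's actual implementation, and the synthetic data and variable names are hypothetical.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Illustrative data: three groups of 50 observations, 2 attributes each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [8.0, 8.0], [0.0, 8.0]])
X = np.vstack([c + rng.normal(scale=0.7, size=(50, 2)) for c in centers])

max_clusters = 6
best_bic, best_model = np.inf, None

# Fit one mixture per candidate component count and keep the lowest BIC.
for k in range(1, max_clusters + 1):
    candidate = GaussianMixture(
        n_components=k, covariance_type="full", random_state=0
    ).fit(X)
    bic = candidate.bic(X)
    if bic < best_bic:
        best_bic, best_model = bic, candidate

labels = best_model.predict(X)  # hard cluster assignment per observation
```

Lower BIC is better, so the loop trades goodness of fit against the number of parameters each extra component adds.
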
geosnap.analyze.cluster.hdbscan(X, min_cluster_size=5, gen_min_span_tree=True, **kwargs)[source]

Clustering with Hierarchical DBSCAN

Parameters
X: array-like

n x k attribute data

min_cluster_size: int, default: 5

The minimum number of points necessary to generate a cluster

gen_min_span_tree: bool, default: True

Whether to generate the minimum spanning tree of the mutual reachability graph for later analysis.

Returns
model: hdbscan HDBSCAN instance
geosnap.analyze.cluster.kmeans(X, n_clusters, init='k-means++', n_init=10, max_iter=300, tol=0.0001, verbose=0, random_state=None, copy_x=True, n_jobs=None, algorithm='auto', precompute_distances='auto', **kwargs)[source]

K-Means clustering.

Parameters
X: array-like

n x k attribute data

n_clusters: int

The number of clusters to form as well as the number of centroids to generate.

Returns
model: sklearn KMeans instance
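
As a rough illustration of the returned model, the sketch below fits scikit-learn's `KMeans` directly with the defaults from the signature above, on the assumption that the wrapper passes them through. Recent scikit-learn releases no longer accept `n_jobs` or `precompute_distances` on `KMeans`, so they are left out here. The synthetic data is illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

# Illustrative data: three groups of 30 observations, 2 attributes each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])

# n_clusters has no default in the signature above, so it is always given.
model = KMeans(
    n_clusters=3,
    init="k-means++",
    n_init=10,
    max_iter=300,
    tol=1e-4,
    random_state=0,
).fit(X)

labels = model.labels_              # cluster index per observation
centroids = model.cluster_centers_  # one row per cluster
```
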
geosnap.analyze.cluster.max_p(X, w, threshold_variable='count', threshold=10, **kwargs)[source]

Max-p clustering algorithm [DAR12]

Parameters
X: array-like

n x k attribute data

w: PySAL W instance

spatial weights matrix

threshold_variable: str, default: "count"

attribute variable used as the floor when forming regions

threshold: int, default: 10

integer that defines the lower limit the threshold variable must reach, summed over the units grouped into a single region

Returns
model: region MaxPRegionsHeu instance
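
To make the role of `threshold` concrete, here is a small standalone check, assuming the usual max-p formulation in which `threshold` is a floor on the regional total of `threshold_variable`. The helper names `region_totals` and `satisfies_floor` are hypothetical and not part of geosnap or region.

```python
from collections import defaultdict


def region_totals(labels, counts):
    """Sum the threshold variable over the units assigned to each region."""
    totals = defaultdict(int)
    for label, count in zip(labels, counts):
        totals[label] += count
    return dict(totals)


def satisfies_floor(labels, counts, threshold):
    """Return True if every region's total reaches the threshold."""
    return all(total >= threshold for total in region_totals(labels, counts).values())


# Six units assigned to three regions, with a count attribute per unit.
labels = [0, 0, 1, 1, 1, 2]
counts = [4, 7, 3, 3, 5, 12]

totals = region_totals(labels, counts)               # {0: 11, 1: 11, 2: 12}
ok = satisfies_floor(labels, counts, threshold=10)   # True
```

Raising the threshold to 12 would make this labeling infeasible, since regions 0 and 1 each total 11.
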
geosnap.analyze.cluster.skater(X, w, n_clusters=5, floor=-inf, trace=False, islands='increase', **kwargs)[source]

SKATER spatial clustering algorithm.

Parameters
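X: array-like

n x k attribute data

w: PySAL W instance

spatial weights matrix

n_clusters: int, optional, default: 5

The number of clusters to form.

floor: float, optional, default: -inf

Minimum number of observations allowed in each cluster.

trace: bool, optional, default: False

Whether to store intermediate labelings as the spanning tree is pruned.

islands: str, optional, default: 'increase'

How to handle islands (units with no neighbors) in the spatial weights matrix.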

Returns
model: skater SKATER instance
geosnap.analyze.cluster.spectral(X, n_clusters, eigen_solver=None, random_state=None, n_init=10, gamma=1.0, affinity='rbf', n_neighbors=10, eigen_tol=0.0, assign_labels='kmeans', degree=3, coef0=1, kernel_params=None, n_jobs=-1, **kwargs)[source]

Spectral clustering.

Parameters
X: array-like

n x k attribute data

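n_clusters: int

The number of clusters to form.

eigen_solver: {None, 'arpack', 'lobpcg', 'amg'}, optional, default: None

The eigenvalue decomposition strategy passed to scikit-learn's SpectralClustering algorithm

random_state: int, optional, default: None

The seed used to generate replicable results

n_init: int, optional, default: 10

Number of times the k-means step is run with different centroid seeds when assign_labels is 'kmeans'

gamma: float, optional, default: 1.0

Kernel coefficient for the rbf, poly, sigmoid, laplacian and chi2 kernels; ignored when affinity is 'nearest_neighbors'

affinity: str or callable, optional, default: 'rbf'

How to construct the affinity matrix, for example 'rbf', 'nearest_neighbors' or 'precomputed'

n_neighbors: int, optional, default: 10

Number of neighbors used when affinity is 'nearest_neighbors'; ignored when affinity is 'rbf'

eigen_tol: float, optional, default: 0.0

Stopping criterion for the eigendecomposition of the Laplacian matrix

assign_labels: {'kmeans', 'discretize'}, optional, default: 'kmeans'

The strategy used to assign labels in the embedding space

degree: float, optional, default: 3

Degree of the polynomial kernel; ignored by other kernels

coef0: float, optional, default: 1

Zero coefficient for the polynomial and sigmoid kernels; ignored by other kernels

kernel_params: dict, optional, default: None

Parameters (keyword arguments) for a kernel passed as a callable; ignored by other kernels

n_jobs: int, optional, default: -1

Number of parallel jobs to run; -1 uses all processors

**kwargs

Additional keyword arguments.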

Returns
model: sklearn SpectralClustering instance
geosnap.analyze.cluster.spenc(X, w, n_clusters=5, gamma=1, **kwargs)[source]

Spatially encouraged spectral clustering

[wolf2018]

Parameters
X: array-like

n x k attribute data

w: PySAL W instance

spatial weights matrix

n_clusters: int, optional, default: 5

The number of clusters to form.

gamma: int, default: 1

Kernel coefficient used for the attribute affinity, passed to the underlying SPENC model.

Returns
model: spenc SPENC instance
geosnap.analyze.cluster.ward(X, n_clusters=5, **kwargs)[source]

Agglomerative clustering using Ward linkage.

Parameters
X: array-like

n x k attribute data

n_clusters: int, optional, default: 5

The number of clusters to form.

Returns
model: sklearn AgglomerativeClustering instance
geosnap.analyze.cluster.ward_spatial(X, w, n_clusters=5, **kwargs)[source]

Agglomerative clustering using Ward linkage with a spatial connectivity constraint.

Parameters
X: array-like

n x k attribute data

w: PySAL W instance

spatial weights matrix

n_clusters: int, optional, default: 5

The number of clusters to form.

Returns
model: sklearn AgglomerativeClustering instance
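
The connectivity constraint can be sketched with scikit-learn alone. The example below assumes the wrapper hands a sparse adjacency matrix derived from `w` (for a PySAL W, its `sparse` attribute) to `AgglomerativeClustering`; that is an assumption about the implementation. To stay self-contained, a regular 4 x 4 grid built with `grid_to_graph` stands in for `w`.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.feature_extraction.image import grid_to_graph

# Illustrative 4 x 4 lattice with one attribute: low values in the two
# left-hand columns, high values in the two right-hand columns.
n_rows, n_cols = 4, 4
grid = np.zeros((n_rows, n_cols))
grid[:, 2:] = 10.0
grid += 0.01 * np.arange(n_rows * n_cols).reshape(n_rows, n_cols)
X = grid.reshape(-1, 1)  # n x k attribute data with k = 1

# Sparse adjacency of the lattice (cells sharing an edge are neighbors);
# this stands in for the spatial weights matrix.
connectivity = grid_to_graph(n_rows, n_cols)

# Ward linkage, merging only clusters that are adjacent in the graph.
model = AgglomerativeClustering(
    n_clusters=2, linkage="ward", connectivity=connectivity
).fit(X)

labels = model.labels_.reshape(n_rows, n_cols)  # labels laid out on the grid
```

With the constraint in place, every cluster is a contiguous set of cells, which is the property that distinguishes this function from the unconstrained Ward variant.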