geosnap.data.Community.cluster

Community.cluster(self, n_clusters=6, method=None, best_model=False, columns=None, verbose=False, return_model=False, scaler=None, **kwargs)

Create a geodemographic typology by running a cluster analysis on the study area’s neighborhood attributes.

Parameters
gdf: pandas.DataFrame

long-form (geo)DataFrame containing neighborhood attributes

n_clusters: int

the number of clusters to model (the default is 6).

method: str

the clustering algorithm used to identify neighborhood types

best_model: bool

if using a Gaussian mixture model, use the Bayesian information criterion (BIC) to choose the best n_clusters (the default is False).
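To make the BIC-based choice concrete, the sketch below scores a few candidate cluster counts and keeps the one with the lowest BIC. It illustrates the criterion itself, not geosnap's internal code: the helper names `bic` and `pick_n_clusters`, the log-likelihood values, and the sample size are all hypothetical, and the range of n_clusters that geosnap actually searches is not stated here.

```python
import math


def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion; lower values are preferred."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def pick_n_clusters(candidates, n_samples):
    """Return the candidate with the lowest BIC and all scores.

    `candidates` maps n_clusters -> (total log-likelihood, free parameters).
    """
    scores = {
        k: bic(log_lik, n_params, n_samples)
        for k, (log_lik, n_params) in candidates.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Made-up fit summaries for illustration only. The parameter counts
# follow a full-covariance mixture on two attributes (6k - 1).
candidates = {2: (-520.0, 11), 3: (-480.0, 17), 4: (-476.0, 23)}
best_k, scores = pick_n_clusters(candidates, n_samples=200)
print(best_k)
```

In this made-up case the fourth component improves the fit only slightly, so the penalty on extra parameters outweighs the gain and the three-cluster model is selected.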

columns: list-like

subset of columns on which to apply the clustering

verbose: bool

whether to print warning messages (the default is False).

return_model: bool

whether to return the underlying cluster model instance for further analysis (the default is False).

scaler: str or sklearn.preprocessing.Scaler

a scikit-learn preprocessing class that will be used to rescale the data. Defaults to StandardScaler.
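Rescaling matters because neighborhood attributes often sit on very different scales, and distance-based clustering would otherwise be dominated by the largest one. As a rough illustration of what standard scaling does to each column (subtract the mean, divide by the population standard deviation), here is a dependency-free sketch. The function `standardize` and the sample values are hypothetical and do not come from geosnap or scikit-learn.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Z-score a column: zero mean and unit (population) standard deviation."""
    mean = fmean(values)
    std = pstdev(values)
    if std == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Two made-up attributes on very different scales.
median_income = [30000.0, 45000.0, 60000.0, 75000.0]
pct_renters = [0.2, 0.4, 0.6, 0.8]

scaled_income = standardize(median_income)
scaled_renters = standardize(pct_renters)
```

After scaling, both columns have the same spread, so neither dominates the distance calculations purely because of its units.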

Returns
pandas.DataFrame with neighborhood cluster labels appended as a new column.
An existing column of the same name will be overwritten.
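A call might look like the sketch below, which uses only keyword arguments from the signature above. It assumes an existing Community instance named `community`, and it assumes that "kmeans" is an accepted value for method; check the list of supported methods for your geosnap version. The wrapper `build_typology` and the column names in the usage comment are hypothetical.

```python
def build_typology(community, columns, n_clusters=6):
    """Cluster a Community on a subset of its attribute columns.

    `community` is assumed to be an existing geosnap Community instance.
    """
    return community.cluster(
        n_clusters=n_clusters,
        method="kmeans",   # assumed method name, not confirmed by this page
        columns=columns,   # restrict the clustering to these attributes
        verbose=True,      # print warning messages while fitting
    )


# Hypothetical usage, with illustrative column names:
# result = build_typology(community, ["median_income", "pct_renters"])
```

Because an existing column with the same name as the label column is overwritten, copy or rename earlier results before running a second clustering that you want to compare against the first.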