Compute hierarchical or k-means cluster analysis and return the group association for each observation as a vector.
sjc.cluster(data, groupcount = NULL, method = c("hclust", "kmeans"), distance = c("euclidean", "maximum", "manhattan", "canberra", "binary", "minkowski"), agglomeration = c("ward", "ward.D", "ward.D2", "single", "complete", "average", "mcquitty", "median", "centroid"), iter.max = 20, algorithm = c("Hartigan-Wong", "Lloyd", "MacQueen"))
| Argument | Description |
|---|---|
| data | A data frame with variables that should be used for the cluster analysis. |
| groupcount | Number of groups (clusters) used for the cluster solution. May also be a set of initial (distinct) cluster centres, in case |
| method | Method for computing the cluster analysis. By default ( |
| distance | Distance measure to be used when |
| agglomeration | Agglomeration method to be used when |
| iter.max | Maximum number of iterations allowed. Only applies if |
| algorithm | Algorithm used for calculating the k-means clusters. Only applies if |
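The two values of method correspond roughly to two familiar base R workflows. The following is a minimal sketch that uses only functions from the stats package (dist, hclust, cutree, kmeans). It illustrates the kind of computation these arguments control and is not taken from the sjc.cluster source; standardizing with scale() and fixing the seed with set.seed() are assumptions made for this example.

```r
# Sketch only: base R equivalents, not the sjc.cluster implementation.
dat <- scale(mtcars)   # standardizing is an assumption of this example

# Hierarchical: distance matrix -> agglomeration -> cut the tree into k groups
d  <- dist(dat, method = "euclidean")
hc <- hclust(d, method = "ward.D2")
groups_h <- cutree(hc, k = 5)

# K-means: starting centres are drawn at random, so fix the seed
set.seed(123)
km <- kmeans(dat, centers = 5, iter.max = 20, algorithm = "Hartigan-Wong")
groups_k <- km$cluster

# Group sizes of both solutions
table(groups_h)
table(groups_k)
```

Both results are integer vectors with one entry per row of the data, which is the shape of result described for this function.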
The group classification for each observation, as a vector. This group
classification can be passed to the sjc.grpdisc function to
check the goodness of classification.
The returned vector includes missing values, so it can be appended
to the original data frame data.
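To illustrate why a group vector that keeps missing values can be appended directly, here is a hedged sketch in base R. It shows one way such a vector can be built (incomplete rows receive NA); whether sjc.cluster proceeds exactly like this is an assumption, and the artificially inserted NA values exist only for the example.

```r
# Sketch only: how a group vector can keep NA for incomplete rows.
dat <- mtcars[, c("mpg", "hp", "wt")]
dat$hp[c(3, 10)] <- NA                  # introduce missing values for illustration

complete <- complete.cases(dat)         # TRUE for rows without any NA
hc <- hclust(dist(scale(dat[complete, ])), method = "ward.D2")

groups <- rep(NA_integer_, nrow(dat))   # one entry per original row
groups[complete] <- cutree(hc, k = 3)   # fill in groups for complete rows only

dat$group <- groups                     # lengths match, so appending works
```

Because the vector has as many entries as the data frame has rows, no re-alignment or merging is needed before appending it.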
In R versions later than 3.0.3, the "ward" option has been replaced by
"ward.D" and "ward.D2", so you may use either of
these values. If "ward" is specified, it is replaced by "ward.D2".
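The two Ward variants can lead to different cluster solutions on the same distance matrix. The sketch below compares them with base R's hclust; the choice of five groups and of scale() for standardizing are assumptions made only for the example.

```r
# Sketch only: compare the two Ward variants offered by stats::hclust.
d <- dist(scale(mtcars))

hc_d  <- hclust(d, method = "ward.D")
hc_d2 <- hclust(d, method = "ward.D2")

g_d  <- cutree(hc_d,  k = 5)
g_d2 <- cutree(hc_d2, k = 5)

# Cross-tabulate the two solutions to see where they agree
table(ward.D = g_d, ward.D2 = g_d2)
```

If the cross-table has exactly one non-zero cell per row and column, the two variants produced the same partition for this data set.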
To get results similar to the SPSS Quick Cluster function, the following points
have to be considered:
Use the /PRINT INITIAL option for SPSS Quick Cluster to get a table with the initial cluster centers.
Create a matrix from this table by copying the values consecutively, one row after another, from the SPSS output into a matrix, and specify the nrow and ncol arguments.
Use algorithm="Lloyd".
Use the same value for iter.max in both SPSS and sjc.qclus.
This ensures a fixed initial set of cluster centers (as in SPSS), whereas kmeans in R
selects the initial cluster centers randomly when only a number of clusters is given.
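The steps above can be sketched with base R's kmeans as follows. The centre values are made up for illustration and do not come from real SPSS output, and the sketch assumes that the SPSS table lists one variable per row and one cluster per column.

```r
# Sketch only: the centre values below are hypothetical, not real SPSS output.
dat <- mtcars[, c("mpg", "hp")]

# Values copied row by row from a hypothetical SPSS "Initial Cluster Centers"
# table (assumed layout: one row per variable, one column per cluster).
spss_values <- c(30,  20,  15,    # mpg for clusters 1, 2, 3
                 70, 120, 220)    # hp  for clusters 1, 2, 3

# matrix() fills column by column, so each column receives one variable
# and each row becomes one cluster centre, which is what kmeans() expects.
centers <- matrix(spss_values, nrow = 3, ncol = 2)
colnames(centers) <- names(dat)

# Fixed starting centres plus Lloyd's algorithm: no random initialization
fit <- kmeans(dat, centers = centers, iter.max = 20, algorithm = "Lloyd")
fit$cluster
```

Because the starting centres are supplied explicitly, repeated runs give the same solution without setting a seed.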
Maechler M, Rousseeuw P, Struyf A, Hubert M, Hornik K (2014) cluster: Cluster Analysis Basics and Extensions. R package.
    # Hierarchical clustering of mtcars-dataset
    groups <- sjc.cluster(mtcars, 5)

    # K-means clustering of mtcars-dataset
    groups <- sjc.cluster(mtcars, 5, method="k")