Clustering

Simple K-Means

The goal of K-means clustering is to partition the data into k clusters so that intra-cluster distances are small and inter-cluster distances are large; in other words, every point is assigned to the cluster whose centre is nearest. The algorithm starts by choosing k centroids at random. It then assigns every data point to the cluster with the nearest centroid and recomputes each centroid as the mean of the data points in its cluster. These two steps are repeated until the cluster means converge.
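
The steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used by the module: the function name `kmeans`, its parameters (`max_iter`, `seed`) and the use of squared Euclidean distance are assumptions made for the example.

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Cluster `points` (a list of equal-length tuples) into k groups.

    Hypothetical helper for illustration; assumes squared Euclidean distance.
    """
    rng = random.Random(seed)
    # Step 1: pick k distinct data points at random as the initial centroids.
    centroids = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Step 2: assign every point to the cluster with the nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Step 3: recompute each centroid as the mean of its assigned points.
        new_centroids = []
        for c, members in zip(centroids, clusters):
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centroids.append(mean)
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(c)
        # Step 4: stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters
```

Calling `kmeans(points, k=2)` returns the final centroids together with the points assigned to each of them. Because the initial centroids are chosen at random, different seeds can lead to different final clusters on less clearly separated data.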

Method parameters

Data files
Raw data files corresponding to the samples to be included in the projection plot.
Colouring style
The dot corresponding to each sample can be coloured according to the sample's parameter state or according to its file.
Peak measuring approach
It can take two values: height or area. The projections will be calculated using one of these two values.
Peaks
Peaks that will be taken into account to create the projection plot.
Visualization
The results of non-hierarchical clustering algorithms can be visualized using PCA or Sammon's projection.
Type of data
It can take two values: samples or variables. The clustering will be applied to one of these types of data.
Algorithm
Algorithm that will be used to cluster the data.
Link type
This parameter is only enabled when hierarchical clustering has been chosen. The distance between clusters is determined by the chosen linkage.
Distance function
This parameter is only enabled when hierarchical clustering has been chosen. The distance between points is determined by the chosen distance function.
Number of groups
Some clustering algorithms require the user to define the number of clusters in advance. This parameter is available only when the K-means or Farthest First algorithm is chosen.
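
To illustrate how the "Link type" and "Distance function" parameters interact in hierarchical clustering, the sketch below computes the distance between two clusters from the pairwise distances between their points. The linkage names (`single`, `complete`, `average`), the two distance functions and the helper `cluster_distance` are assumptions chosen for the example; the options actually offered by the module may differ.

```python
import math
from itertools import product

def euclidean(p, q):
    """Straight-line distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    """Sum of absolute coordinate differences between two points."""
    return sum(abs(a - b) for a, b in zip(p, q))

def cluster_distance(c1, c2, link="average", dist=euclidean):
    """Distance between clusters c1 and c2 (hypothetical helper).

    `dist` plays the role of the distance function (point to point);
    `link` plays the role of the link type (cluster to cluster).
    """
    pairwise = [dist(p, q) for p, q in product(c1, c2)]
    if link == "single":      # closest pair of points
        return min(pairwise)
    if link == "complete":    # farthest pair of points
        return max(pairwise)
    if link == "average":     # mean over all pairs
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown link type: {link}")
```

For the same two clusters, single linkage never gives a larger value than average linkage, which in turn never exceeds complete linkage, so the choice of link type can change which clusters are merged first.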