Clustering

Hierarchical clustering

Hierarchical clustering builds a hierarchy of clusters. It is achieved either by agglomerative clustering, in which every point initially belongs to a distinct cluster and the nearest clusters are merged iteratively, or by divisive clustering, which starts from a single cluster containing all data points and splits clusters until every single point belongs to a separate cluster. The distances between points may be determined using, e.g., Euclidean, Minkowski or Manhattan distance. The distances between clusters may be determined by single linkage (the minimum distance over all pairs of points taken one from each cluster), complete linkage (the maximum distance over all such pairs), and so on. The number of clusters is determined by choosing a distance at which to "cut" the hierarchical clustering tree, but hierarchical clustering is more commonly used as a tool for visualizing neighbourhood patterns.
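
The agglomerative procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used by this module: the names agglomerative, manhattan, cut, point_distance and linkage are hypothetical, and the brute-force search for the nearest pair of clusters is only suitable for small data sets.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def manhattan(p, q):
    """Manhattan distance, usable as an alternative point_distance."""
    return sum(abs(a - b) for a, b in zip(p, q))


def agglomerative(points, cut, point_distance=dist, linkage=min):
    """Merge the two nearest clusters until the nearest pair is farther apart than `cut`.

    linkage=min gives single linkage, linkage=max gives complete linkage.
    Returns the clusters as sorted lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]  # every point starts alone

    def cluster_distance(a, b):
        return linkage(point_distance(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        # Find the pair of clusters with the smallest linkage distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        if cluster_distance(clusters[i], clusters[j]) > cut:
            break  # "cutting" the tree at this distance
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i stays valid
    return [sorted(c) for c in clusters]


# Hypothetical example data: four points on a line.
points = [(0.0, 0.0), (1.0, 0.0), (2.2, 0.0), (10.0, 0.0)]
single = agglomerative(points, cut=1.5)                  # [[0, 1, 2], [3]]
complete = agglomerative(points, cut=1.5, linkage=max)   # [[0, 1], [2], [3]]
```

With the same cut, single linkage chains the first three points together because each is close to its neighbour, whereas complete linkage keeps the third point separate because it is too far from the first one.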

Method parameters

Data files
Raw data files corresponding to the samples selected to be in the projection plot.
Colouring style
The dots corresponding to each sample can be coloured according to the sample's parameter state or the file.
Peak measuring approach
It can take two values: height or area. The projections will be calculated using one of these two values.
Peaks
Peaks that will be taken into account to create the projection plot.
Visualization
The results of non-hierarchical clustering algorithms can be visualized using PCA or Sammon's projection.
Type of data
It can take two values: samples or variables. The clustering will be applied to one of these types of data.
Algorithm
Algorithm that will be used to cluster the data.
Link type
This parameter is only enabled when hierarchical clustering has been chosen. The distance between clusters is determined by the chosen linkage.
Distance function
This parameter is only enabled when hierarchical clustering has been chosen. The distance between points is determined by the chosen distance function.
Number of groups
For some clustering algorithms, the number of clusters has to be defined by the user in advance. This parameter is available only when the K-means or Farthest First algorithm is chosen.
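
To illustrate why the number of groups must be known in advance, here is a minimal Python sketch. It assumes that "Farthest First" refers to the farthest-first traversal heuristic, which this page does not spell out. The function farthest_first and its parameters n_groups and first are hypothetical names, and Euclidean distance is assumed.

```python
from math import dist  # Euclidean distance, Python 3.8+


def farthest_first(points, n_groups, first=0):
    """Pick n_groups centres by farthest-first traversal, then label each point.

    Starts from the point with index `first`; each further centre is the point
    farthest from all centres chosen so far.
    Returns (indices of the centres, group label of every point).
    """
    centers = [first]
    # Distance from every point to its nearest centre chosen so far.
    nearest = [dist(p, points[first]) for p in points]
    while len(centers) < n_groups:
        nxt = max(range(len(points)), key=nearest.__getitem__)
        centers.append(nxt)
        nearest = [min(d, dist(p, points[nxt])) for d, p in zip(nearest, points)]
    # Assign every point to the group of its closest centre.
    labels = [
        min(range(len(centers)), key=lambda c: dist(p, points[centers[c]]))
        for p in points
    ]
    return centers, labels


# Hypothetical example data: two pairs of nearby points and one outlier.
points = [(0.0, 0.0), (1.0, 0.0), (9.0, 0.0), (10.0, 0.0), (5.0, 8.0)]
centers, labels = farthest_first(points, n_groups=3)
# centers == [0, 3, 4], labels == [0, 0, 1, 1, 2]
```

The loop stops only when n_groups centres have been chosen, so the algorithm cannot run without that value. K-means has the same requirement, since it starts from a fixed number of centroids.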