Clustering
Hierarchical clustering
Hierarchical clustering builds a hierarchy of clusters. This is achieved either by agglomerative clustering, in which every point initially belongs to its own cluster and the nearest clusters are merged iteratively, or by divisive clustering, which starts from a single cluster containing all data points and splits it until every single point belongs to a separate cluster. The distances between points may be determined using, e.g., the Euclidean, Minkowski or Manhattan distance; the distances between clusters may be determined by single linkage (the minimum distance over all pairs of points between the two clusters), complete linkage (the maximum distance over all pairs of points between the two clusters), and so on. The number of clusters is determined by choosing a distance at which to "cut" the hierarchical clustering tree, but hierarchical clustering is more commonly used as a tool for visualizing neighbourhood patterns.
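The agglomerative procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used by this module: the function names `linkage_distance` and `agglomerative`, the `cut` argument and the sample points are all assumptions made for the example, and only the Euclidean distance with single or complete linkage is covered.

```python
from itertools import combinations
from math import dist  # Euclidean distance between two points (Python 3.8+)


def linkage_distance(a, b, points, linkage):
    """Distance between clusters a and b, given as lists of point indices."""
    pairwise = [dist(points[i], points[j]) for i in a for j in b]
    # single linkage: closest pair of points; complete linkage: farthest pair
    return min(pairwise) if linkage == "single" else max(pairwise)


def agglomerative(points, cut, linkage="single"):
    """Merge the two nearest clusters until the nearest pair is farther apart than `cut`."""
    clusters = [[i] for i in range(len(points))]  # every point starts in its own cluster
    while len(clusters) > 1:
        # find the pair of clusters with the smallest linkage distance
        (x, y), d = min(
            (((x, y), linkage_distance(clusters[x], clusters[y], points, linkage))
             for x, y in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if d > cut:  # "cutting" the tree: stop merging above this distance
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # safe because x < y
    return clusters


# hypothetical sample data: two tight pairs and one isolated point
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
result = agglomerative(points, cut=2.0)
print(result)  # [[0, 1], [2, 3], [4]]
```

Raising `cut` merges more clusters, and a sufficiently large value leaves a single cluster containing all points. The loop recomputes every pairwise distance at each step, which is fine for illustration but too slow for large data sets.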
Method parameters
- Data files
- Raw data files corresponding to the samples selected to be in the projection plot.
- Colouring style
- The dot corresponding to each sample can be coloured according to the sample's parameter state or to its file.
- Peak measuring approach
- It can take two values: height or area. The projections will be calculated using one of these two values.
- Peaks
- Peaks that will be taken into account to create the projection plot.
- Visualization
- The results of non-hierarchical clustering algorithms can be visualized using PCA or Sammon's projection.
- Type of data
- It can take two values: samples or variables. The clustering will be applied to one of these types of data.
- Algorithm
- Algorithm that will be used to cluster the data.
- Link type
- This parameter is only enabled when hierarchical clustering has been chosen. The distance between clusters is determined by the chosen linkage.
- Distance function
- This parameter is only enabled when hierarchical clustering has been chosen. The distance between points is determined by the chosen distance function.
- Number of groups
- For some clustering algorithms, the number of clusters has to be defined by the user in advance. This parameter is available only when the K-means or Farthest First algorithm is chosen.
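To illustrate why the number of groups must be given in advance, here is a minimal sketch of the generic farthest-first traversal idea, in which exactly k centres are picked before any point is assigned. It is not the code used by this module: the function names `farthest_first_centers` and `assign`, the choice of the first point as seed (implementations often pick it at random) and the sample points are assumptions made for the example.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def farthest_first_centers(points, k):
    """Pick k centre indices: seed with one point, then repeatedly add the
    point that is farthest from all centres chosen so far (assumes k <= len(points))."""
    centers = [0]  # assumption: the first point is the seed
    # distance from every point to its nearest chosen centre
    nearest = [dist(p, points[0]) for p in points]
    while len(centers) < k:
        nxt = max(range(len(points)), key=nearest.__getitem__)
        centers.append(nxt)
        nearest = [min(d, dist(p, points[nxt])) for d, p in zip(nearest, points)]
    return centers


def assign(points, centers):
    """Label each point with the position (in `centers`) of its nearest centre."""
    return [
        min(range(len(centers)), key=lambda c: dist(p, points[centers[c]]))
        for p in points
    ]


# hypothetical sample data and a user-chosen number of groups
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
centers = farthest_first_centers(points, k=3)
labels = assign(points, centers)
print(centers)  # [0, 4, 3]
print(labels)   # [0, 0, 2, 2, 1]
```

Unlike hierarchical clustering, there is no tree to cut afterwards. Changing the number of groups means running the algorithm again with a different k.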